Cylc documentation

The Cylc Suite Engine

Current release: 7.9.3

Released Under the GNU GPL v3.0 Software License

Copyright (C) NIWA & British Crown (Met Office) & Contributors.


Cylc (“silk”) is a workflow engine for cycling systems - it orchestrates distributed suites of interdependent cycling tasks that may continue to run indefinitely.


1. Introduction: How Cylc Works

This section of the user guide is being rewritten for Cylc 8. For the moment we’ve removed some outdated information, leaving just the description of how Cylc manages cycling workflows. For a more up-to-date description see references cited on the Cylc web site.

1.1. Dependence Between Tasks

1.1.1. Intra-cycle Dependence

Most dependence between tasks applies within a single cycle point. Fig. 1 shows the dependency diagram for a single cycle point of a simple example suite of three scientific models (say a, b, and c) and three post processing or product generation tasks (d, e, and f). A scheduler capable of handling this must manage, within a single cycle point, multiple parallel streams of execution that branch when one task generates output for several downstream tasks, and merge when one task takes input from several upstream tasks.

_images/dep-one-cycle.png

Fig. 1 A single cycle point dependency graph for a simple suite. The dependency graph for a single cycle point of a simple example suite. Tasks a, b, and c represent models, d, e and f are post processing or product generation tasks, and x represents external data that the upstream model depends on.

_images/timeline-one.png

Fig. 2 A single cycle point job schedule for real time operation. The optimal job schedule for two consecutive cycle points of our example suite during real time operation, assuming that all tasks trigger off upstream tasks finishing completely. The horizontal extent of a task bar represents its execution time, and the vertical blue lines show when the external driving data becomes available.

Fig. 2 shows the optimal job schedule for two consecutive cycle points of the example suite in real time operation, given execution times represented by the horizontal extent of the task bars. There is a time gap between cycle points as the suite waits on new external driving data. Each task in the example suite happens to trigger off upstream tasks finishing, rather than off any intermediate output or event; this is merely a simplification that makes for clearer diagrams.

_images/dep-two-cycles-linked.png

Fig. 3 What if the external driving data is available early? If the external driving data is available in advance, can we start running the next cycle point early?

_images/timeline-one-c.png

Fig. 4 Attempted overlap of consecutive single-cycle-point job schedules. A naive attempt to overlap two consecutive cycle points using the single-cycle-point dependency graph. The red shaded tasks will fail because of dependency violations (or will not be able to run because of upstream dependency violations).

_images/timeline-one-a.png

Fig. 5 The only safe multi-cycle-point job schedule? The best that can be done in general when inter-cycle dependence is ignored.

Now the question arises: what happens if the external driving data for upcoming cycle points is available in advance, as it would be after a significant delay in operations, or when running a historical case study? While model a appears to depend only on the external data x, in fact it could also depend on its own previous instance for the model background state used in initializing the new run (this is almost always the case for atmospheric models used in weather forecasting). Thus, as alluded to in Fig. 3, task a could in principle start as soon as its predecessor has finished. Fig. 4 shows, however, that starting a whole new cycle point at this point is dangerous - it results in dependency violations in half of the tasks in the example suite. In fact the situation could be even worse than this - imagine that task b in the first cycle point is delayed for some reason after the second cycle point has been launched. Clearly we must either handle inter-cycle dependence explicitly or else agree not to start the next cycle point early, as illustrated in Fig. 5.

1.1.2. Inter-Cycle Dependence

Weather forecast models typically depend on their own most recent previous forecast for background state or restart files of some kind (this is called warm cycling), but there can also be inter-cycle dependence between other tasks. In an atmospheric forecast analysis suite, for instance, the weather model may generate background states for observation processing and data-assimilation tasks in the next cycle point as well as for the next forecast model run. In real time operation inter-cycle dependence can be ignored because it is automatically satisfied when one cycle point finishes before the next begins. If it is not ignored it drastically complicates the dependency graph by blurring the clean boundary between cycle points. Fig. 6 illustrates the problem for our simple example suite assuming minimal inter-cycle dependence: the warm cycled models (a, b, and c) each depend on their own previous instances.

For this reason, and because we tend to see forecasting suites in terms of their real time characteristics, other metaschedulers have ignored inter-cycle dependence and are thus restricted to running entire cycle points in sequence at all times. This does not affect normal real time operation but it can be a serious impediment when advance availability of external driving data makes it possible, in principle, to run some tasks from upcoming cycle points before the current cycle point is finished - as was suggested at the end of the previous section. This can occur, for instance, after operational delays (late arrival of external data, system maintenance, etc.) and to an even greater extent in historical case studies and parallel test suites started behind a real time operation. It can be a serious problem for suites that have little downtime between forecast cycle points and therefore take many cycle points to catch up after a delay. Without taking account of inter-cycle dependence, the best that can be done, in general, is to reduce the gap between cycle points to zero as shown in Fig. 5. A limited crude overlap of the single cycle point job schedule may be possible for specific task sets, but the allowable overlap may change if new tasks are added, and it is still dangerous: it amounts to running different parts of a dependent system as if they were not dependent, so it cannot be guaranteed that some unforeseen delay in one cycle point after the next has begun (e.g. due to resource contention or task failures) won't result in dependency violations.

_images/dep-multi-cycle.png

Fig. 6 The complete multi-cycle-point dependency graph. The complete dependency graph for the example suite, assuming the least possible inter-cycle dependence: the models (a, b, and c) depend on their own previous instances. The dashed arrows show connections to previous and subsequent cycle points.

_images/timeline-two-cycles-optimal.png

Fig. 7 The optimal two-cycle-point job schedule. The optimal two cycle job schedule when the next cycle’s driving data is available in advance, possible in principle when inter-cycle dependence is handled explicitly.

Fig. 7 shows, in contrast to Fig. 4, the optimal two cycle point job schedule obtained by respecting all inter-cycle dependence. This assumes no delays due to resource contention or otherwise - i.e. every task runs as soon as it is ready to run. The scheduler running this suite must be able to adapt dynamically to external conditions that impact on multi-cycle-point scheduling in the presence of inter-cycle dependence or else, again, risk bringing the system down with dependency violations.

_images/timeline-three.png

Fig. 8 Comparison of job schedules after a delay. Job schedules for the example suite after a delay of almost one whole cycle point, when inter-cycle dependence is taken into account (above the time axis), and when it is not (below the time axis). The colored lines indicate the time that each cycle point is delayed, and normal “caught up” cycle points are shaded gray.

_images/timeline-two.png

Fig. 9 Optimal job schedule when all external data is available. Job schedules for the example suite in case study mode, or after a long delay, when the external driving data are available many cycle points in advance. Above the time axis is the optimal schedule obtained when the suite is constrained only by its true dependencies, as in Fig. 3, and underneath is the best that can be done, in general, when inter-cycle dependence is ignored.

To further illustrate the potential benefits of proper inter-cycle dependency handling, Fig. 8 shows an operational delay of almost one whole cycle point in a suite with little downtime between cycle points. Above the time axis is the optimal schedule that is possible in principle when inter-cycle dependence is taken into account, and below it is the only safe schedule possible in general when it is ignored. In the former case, even the cycle point immediately after the delay is hardly affected, and subsequent cycle points are all on time, whilst in the latter case it takes five full cycle points to catch up to normal real time operation [1].

Similarly, Fig. 9 shows example suite job schedules for an historical case study, or when catching up after a very long delay; i.e. when the external driving data are available many cycle points in advance. Task a, which as the most upstream model is likely to be a resource intensive atmosphere or ocean model, has no upstream dependence on co-temporal tasks and can therefore run continuously, regardless of how much downstream processing is yet to be completed in its own, or any previous, cycle point (actually, task a does depend on co-temporal task x which waits on the external driving data, but that returns immediately when the data is available in advance, so the result stands). The other models can also cycle continuously or with a short gap between, and some post processing tasks, which have no previous-instance dependence, can run continuously or even overlap (e.g. e in this case). Thus, even for this very simple example suite, tasks from three or four different cycle points can in principle run simultaneously at any given time.

In fact, if our tasks are able to trigger off internal outputs of upstream tasks (message triggers) rather than waiting on full completion, then successive instances of the models could overlap as well (because model restart outputs are generally completed early in the run) for an even more efficient job schedule.

[1] Note that simply overlapping the single cycle point schedules of Fig. 2 from the same start point would have resulted in dependency violation by task c.

2. Cylc Screenshots

_images/gcylc-graph-and-dot-views.png

Fig. 10 gcylc graph and dot views.

_images/gcylc-text-view.png

Fig. 11 gcylc text view.

_images/gscan.png

Fig. 12 gscan multi-suite state summary GUI.

_images/ecox-1.png

Fig. 13 A large-ish suite graphed by cylc.

3. Installation

Cylc runs on Linux. It is tested quite thoroughly on modern RHEL and Ubuntu distros. Some users have also managed to make it work on other Unix variants, including Apple OS X, but these platforms are not officially tested or supported.

3.1. Third-Party Software Packages

Python 2 >= 2.7 is required. Python 2 >= 2.7.9 is recommended for the best security. Python 2 should already be installed in your Linux system.

For Cylc’s HTTPS communications layer, the following Python packages are used if available (see the cylc check-software output below):

  • requests (2.4.2+) - recommended for client-server communication
  • urllib3
  • OpenSSL (pyOpenSSL)

The following packages are highly recommended, but are technically optional as you can construct and run suites without dependency graph visualisation or the Cylc GUIs:

  • PyGTK - Python bindings for the GTK+ GUI toolkit.

    Note

    PyGTK typically comes with your system Python 2 version. It is allegedly quite difficult to install if you need to do so for another Python version. At time of writing, for instance, there are no functional PyGTK conda packages available.

Note that we need to do import gtk in Python, not import pygtk.

In CentOS 7.6, for example, the Cylc GUIs run “out of the box” with the system-installed Python 2.7.5. Under the hood, the Python “gtk” package is provided by the “pygtk2” yum package. (The “pygtk” Python module, which we don’t need, is supplied by the “pygobject2” yum package).

  • Graphviz - graph layout engine (tested 2.36.0)

  • Pygraphviz - Python Graphviz interface (tested 1.2). To build this you may need some devel packages too:

    • python-devel
    • graphviz-devel

    Note

    The cylc graph command for static workflow visualization requires PyGTK, but we provide a separate cylc ref-graph command to print out a simple text-format “reference graph” without PyGTK.

The Cylc Review service does not need any additional packages.

The following packages are necessary for running all the tests in Cylc:

To generate the HTML User Guide, you will need:

  • Sphinx (version >= 1.5.3 and <= 1.7.9).

In most modern Linux distributions all of the software above can be installed via the system package manager. Otherwise download packages manually and follow their native installation instructions. To check that all packages are installed properly:

$ cylc check-software
Checking your software...

Individual results:
===============================================================================
Package (version requirements)                          Outcome (version found)
===============================================================================
                              *REQUIRED SOFTWARE*
Python (2.7+, <3).....................FOUND & min. version MET (2.7.12.final.0)

       *OPTIONAL SOFTWARE for the GUI & dependency graph visualisation*
Python:pygraphviz (any)...........................................NOT FOUND (-)
graphviz (any)...................................................FOUND (2.26.0)
Python:pygtk (2.0+)...............................................NOT FOUND (-)

            *OPTIONAL SOFTWARE for the HTTPS communications layer*
Python:requests (2.4.2+)......................FOUND & min. version MET (2.11.1)
Python:urllib3 (any)..............................................NOT FOUND (-)
Python:OpenSSL (any)..............................................NOT FOUND (-)

             *OPTIONAL SOFTWARE for the configuration templating*
Python:EmPy (any).................................................NOT FOUND (-)

                *OPTIONAL SOFTWARE for the HTML documentation*
Python:sphinx (1.5.3+).........................FOUND & min. version MET (1.7.0)
===============================================================================

Summary:
                         ****************************
                             Core requirements: ok
                          Full-functionality: not ok
                         ****************************

If errors are reported then the packages concerned are either not installed or not in your Python search path.

Note

cylc check-software has become quite trivial as we’ve removed or bundled some former dependencies, but in future we intend to make it print a comprehensive list of library versions etc. to include with bug reports.

To check for specific packages only, supply them as arguments to the check-software command, either in the form used in the output of the bare command, without any parent package prefix and colon (e.g. requests rather than Python:requests), or all in lower-case if the given form contains capitals. For example:

$ cylc check-software graphviz Python urllib3

With arguments, check-software provides an exit status indicating a collective pass (zero) or a failure of that number of packages to satisfy the requirements (non-zero integer).
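For example, given the environment shown above (where urllib3 was not found), checking just those two packages would exit with a failure count of one:

$ cylc check-software Python urllib3   # per-package results printed as above
$ echo $?
1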

3.2. Software Bundled With Cylc

Cylc bundles several third party packages which do not need to be installed separately.

  • cherrypy 6.0.2 (slightly modified): a pure Python HTTP framework that we use as a web server for communication between server processes (suite server programs) and client programs (running tasks, GUIs, CLI commands).
    • Client communication is via the Python requests library if available (recommended) or else pure Python via urllib2.
  • Jinja2 2.10: a full featured template engine for Python, and its dependency MarkupSafe 0.23; both BSD licensed.
  • the xdot graph viewer (modified), LGPL licensed.

3.3. Installing Cylc

Cylc releases can be downloaded from GitHub.

The wrapper script usr/bin/cylc, from the release tarball, should be installed to the system executable search path (e.g. /usr/local/bin/) and modified slightly to point to a location such as /opt where successive Cylc releases will be unpacked side by side.

To install Cylc, unpack the release tarball in the right location, e.g. /opt/cylc-7.8.2, type make inside the release directory, and set site defaults - if necessary - in a site global config file (below).

Make a symbolic link from cylc to the latest installed version: ln -s /opt/cylc-7.8.2 /opt/cylc. This will be invoked by the central wrapper if a specific version is not requested. Otherwise, the wrapper will attempt to invoke the Cylc version specified in $CYLC_VERSION, e.g. CYLC_VERSION=7.8.2. This variable is automatically set in task job scripts to ensure that jobs use the same Cylc version as their parent suite server program. It can also be set by users, manually or in login scripts, to fix the Cylc version in their environment.
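For example, a typical central installation might look like this sketch (paths and version numbers are illustrative):

$ cd /opt
$ tar xzf /path/to/cylc-7.8.2.tar.gz   # unpack the release
$ cd cylc-7.8.2
$ make                                 # complete the installation in place
$ ln -sfn /opt/cylc-7.8.2 /opt/cylc    # update the "latest version" symlink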

Installing subsequent releases is just a matter of unpacking the new tarballs next to the previous releases, running make in them, and copying in (possibly with modifications) the previous site global config file.

3.3.1. Local User Installation

It is easy to install Cylc under your own user account if you don’t have root or sudo access to the system: just put the central Cylc wrapper in $HOME/bin/ (making sure that is in your $PATH) and modify it to point to a directory such as $HOME/cylc/ where you will unpack and install release tarballs. Local installation of third party dependencies like Graphviz is also possible, but that depends on the particular installation methods used and is outside of the scope of this document.
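A local installation might look like this sketch (paths are illustrative; the location to edit is commented in the wrapper script itself):

$ mkdir -p $HOME/bin $HOME/cylc
$ tar xzf cylc-7.8.2.tar.gz -C $HOME/cylc   # unpack the release
$ cd $HOME/cylc/cylc-7.8.2 && make
$ cp usr/bin/cylc $HOME/bin/cylc            # install the central wrapper
# now edit $HOME/bin/cylc to point to $HOME/cylc instead of the default location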

3.3.2. Create A Site Config File

Site and user global config files define some important parameters that affect all suites, some of which may need to be customized for your site. See Global (Site, User) Configuration Files for how to generate an initial site file and where to install it. All legal site and user global config items are defined in Global (Site, User) Config File Reference.

3.3.3. Configure Site Environment on Job Hosts

If your users submit task jobs to hosts other than the hosts they use to run their suites, you should ensure that the job hosts have the correct environment for running cylc. A cylc suite generates task job scripts that normally invoke bash -l, i.e. they invoke bash as a login shell to run the job script. Users and sites should ensure that their bash login profiles are able to set up the correct environment for running cylc and their task jobs.

Your site administrator may customise the environment for all task jobs by adding a <cylc-dir>/etc/job-init-env.sh file and populating it with the appropriate contents. If customisation is still required, you can add your own ${HOME}/.cylc/job-init-env.sh file and populate it with the appropriate contents.

  • ${HOME}/.cylc/job-init-env.sh
  • <cylc-dir>/etc/job-init-env.sh

The job will attempt to source the first of these files it finds to set up its environment.
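For example, a site file might contain something like this (the contents are entirely site-dependent; this is just a sketch):

# <cylc-dir>/etc/job-init-env.sh
# Sourced by task job scripts, if present, to set up the job environment.
export PATH=/opt/cylc/bin:$PATH   # make the cylc wrapper available to jobs
# module load python/2.7         # e.g. load environment modules, if your site uses them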

3.3.4. Configuring Cylc Review Under Apache

The Cylc Review web service displays suite job logs and other information in web pages - see Viewing Suite Logs in a Web Browser: Cylc Review and Fig. 14. It can run under a WSGI server (e.g. Apache with mod_wsgi) as a service for all users, or as an ad hoc service under your own user account.

To run Cylc Review under Apache, install mod_wsgi and configure it as follows, with paths modified appropriately:

# Apache mod_wsgi config file, e.g.:
#   Red Hat Linux: /etc/httpd/conf.d/cylc-wsgi.conf
#   Ubuntu Linux: /etc/apache2/mods-available/wsgi.conf
# E.g. for /opt/cylc-7.8.1/
WSGIPythonPath /opt/cylc-7.8.1/lib
WSGIScriptAlias /cylc-review /opt/cylc-7.8.1/bin/cylc-review

(Note the WSGIScriptAlias determines the service URL under the server root).

And allow Apache access to the Cylc library:

# Directory access, in main Apache config file, e.g.:
#   Red Hat Linux: /etc/httpd/conf/httpd.conf
#   Ubuntu Linux: /etc/apache2/apache2.conf
# E.g. for /opt/cylc-7.8.1/
<Directory /opt/cylc-7.8.1/>
        AllowOverride None
        Require all granted
</Directory>

The host running the Cylc Review web service, and the service itself (or the user that it runs as) must be able to view the ~/cylc-run directory of all Cylc users.

Use the web server log, e.g. /var/log/httpd/ or /var/log/apache2/, to debug problems.

3.4. Automated Tests

The cylc test battery is primarily intended for developers to check that changes to the source code don’t break existing functionality.

Note

Some test failures can be expected to result from suites timing out, even if nothing is wrong, if you run too many tests in parallel. See cylc test-battery --help.

4. Cylc Terminology

4.1. Jobs and Tasks

A job is a program or script that runs on a computer, and a task is a workflow abstraction - a node in the suite dependency graph - that represents a job.

4.2. Cycle Points

A cycle point is a particular date-time (or integer) point in a sequence of date-time (or integer) points. Each cylc task has a private cycle point and can advance independently to subsequent cycle points. It may sometimes be convenient, however, to refer to the “current cycle point” of a suite (or the previous or next one, etc.) with reference to a particular task, or in the sense of all task instances that “belong to” a particular cycle point. But keep in mind that different tasks may pass through the “current cycle point” (etc.) at different times as the suite evolves.

5. Workflows For Cycling Systems

A model run and associated processing may need to be cycled for the following reasons:

  • In real time forecasting systems, a new forecast may be initiated at regular intervals when new real time data comes in.
  • It may be convenient (or necessary, e.g. due to batch scheduler queue limits) to split single long model runs into many smaller chunks, each with associated pre- and post-processing workflows.

Cylc provides two ways of constructing workflows for cycling systems: cycling workflows and parameterized tasks.

5.1. Cycling Workflows

This is cylc’s classic cycling mode as described in the Introduction. Each instance of a cycling job is represented by a new instance of the same task, with a new cycle point. The suite configuration defines patterns for extending the workflow on the fly, so it can keep running indefinitely if necessary. For example, to cycle model.exe on a monthly sequence we could define a single task model, an initial cycle point, and a monthly sequence. Cylc then generates the date-time sequence and creates a new task instance for each cycle point as it comes up. Workflow dependencies are defined generically with respect to the “current cycle point” of the tasks involved.
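A minimal sketch of the monthly example (dates and names are illustrative; see Scheduling - Dependency Graphs for the full syntax):

[scheduling]
    initial cycle point = 20210101T00
    [[dependencies]]
        [[[P1M]]]  # a monthly sequence starting at the initial cycle point
            # if each run depends on the previous one (warm cycling):
            graph = "model[-P1M] => model"
[runtime]
    [[model]]
        script = model.exe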

This is the only sensible way to run very large suites or operational suites that need to continue cycling indefinitely. The cycling is configured with standards-based ISO 8601 date-time recurrence expressions. Multiple cycling sequences can be used at once in the same suite. See Scheduling - Dependency Graphs.

5.2. Parameterized Tasks as a Proxy for Cycling

It is also possible to run cycling jobs with a pre-defined static workflow in which each instance of a cycling job is represented by a different task: as far as the abstract workflow is concerned there is no cycling. The sequence of tasks can be constructed efficiently, however, using cylc’s built-in suite parameters (Parameterized Cycling) or explicit Jinja2 loops (Jinja2).

For example, to run model.exe 12 times on a monthly cycle we could loop over an integer parameter R = 0, 1, 2, ..., 11 to define tasks model-R0, model-R1, model-R2, ...model-R11, and the parameter values could be multiplied by the interval P1M (one month) to get the start point for the corresponding model run.
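A sketch of this using suite parameters (names are illustrative, and with a suitable parameter expansion template the generated task names would be model-R0, model-R1, ...):

[cylc]
    [[parameters]]
        R = 0..11
[scheduling]
    [[dependencies]]
        # run the twelve chunks in sequence; the out-of-range dependency
        # for the first chunk (R-1 = -1) is simply dropped
        graph = "model<R-1> => model<R>"
[runtime]
    [[model<R>]]
        # the parameter value is exported to the job as $CYLC_TASK_PARAM_R;
        # multiply it by P1M to compute this chunk's start point
        script = model.exe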

This method is only good for smaller workflows of finite duration because every single task has to be mapped out in advance, and cylc has to be aware of all of them throughout the entire run. Additionally, Cylc's cycling workflow capabilities (above) are more powerful, more flexible, and generally easier to use (Cylc will generate the cycle point date-times for you, for instance), so that is the recommended way to drive most cycling systems.

The primary use for parameterized tasks in cylc is to generate ensembles and other groups of related tasks at the same cycle point, not as a proxy for cycling.

5.3. Mixed Cycling Workflows

For completeness we note that parameterized cycling can be used within a cycling workflow. For example, in a daily cycling workflow long (daily) model runs could be split into four shorter runs by parameterized cycling. A simpler six-hourly cycling workflow should be considered first, however.
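A sketch of the daily example (illustrative only):

[cylc]
    [[parameters]]
        chunk = 1..4
[scheduling]
    initial cycle point = 20210101T00
    [[dependencies]]
        [[[P1D]]]
            graph = """
model<chunk=4>[-P1D] => model<chunk=1> # first chunk follows previous day's last
model<chunk-1> => model<chunk>         # chunks run in sequence within the day
                    """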

6. Global (Site, User) Configuration Files

Cylc site and user global configuration files contain settings that affect all suites. Some of these, such as the range of network ports used by cylc, should be set at site level. Legal items, values, and system defaults are documented in Global (Site, User) Config File Reference.

# cylc site global config file
<cylc-dir>/etc/global.rc

Others, such as the preferred text editor for suite configurations, can be overridden by users:

# cylc user global config file
~/.cylc/$(cylc --version)/global.rc  # e.g. ~/.cylc/7.8.2/global.rc

The file <cylc-dir>/etc/global.rc.eg contains instructions on how to generate and install site and user global config files:

#------------------------------------------------------------------------------
# How to create a site or user global.rc config file.
#------------------------------------------------------------------------------
# The "cylc get-global-config" command prints - in valid global.rc format -
# system global defaults, overridden by site global settings (if any),
# overridden by user global settings (if any).
#
# Therefore, to generate a new global config file, do this:
#   % cylc get-global-config > global.rc
# edit it as needed and install it in the right location (below).
#
# For legal config items, see the User Guide's global.rc reference appendix.
#
# FILE LOCATIONS:
#----------------------
# SITE: delete or comment out items that you do not need to change (otherwise
# you may unwittingly override future changes to system defaults).
#
# The SITE FILE LOCATION is [cylc-dir]/etc/global.rc, where [cylc-dir] is your
# install location, e.g. /opt/cylc/cylc-7.8.2.
#
# FORWARD COMPATIBILITY: The site global.rc file must be kept in the source
# installation (i.e. it is version specific) because older versions of Cylc
# will not understand newer global config items. WHEN YOU INSTALL A NEW VERSION
# OF CYLC, COPY OVER YOUR OLD SITE GLOBAL CONFIG FILE AND ADD TO IT IF NEEDED.
#
#----------------------
# USER: delete or comment out items that you do not need to change (otherwise
# you may unwittingly override future changes to site or system defaults).
#
# The USER FILE LOCATIONS are:
#   1) ~/.cylc/<CYLC-VERSION>/global.rc  # e.g. ~/.cylc/7.8.2/global.rc
#   2) ~/.cylc/global.rc
# If the first location is not found, the second will be checked.
# 
# The version-specific location is preferred - see FORWARD COMPATIBILITY above.
# WHEN YOU FIRST USE A NEW VERSION OF CYLC, COPY OVER YOUR OLD USER GLOBAL
# CONFIG FILE AND ADD TO IT IF NEEDED. However, if you only set items that don't
# change from one version to the next, you may be OK with the second location.
#------------------------------------------------------------------------------

7. Tutorial

This section provides a hands-on tutorial introduction to basic cylc functionality.

7.1. User Config File

Some settings affecting cylc’s behaviour can be defined in site and user global config files. For example, to choose the text editor invoked by cylc on suite configurations:

# $HOME/.cylc/$(cylc --version)/global.rc
[editors]
    terminal = vim
    gui = gvim -f

7.1.1. Configure Environment on Job Hosts

See Configure Site Environment on Job Hosts for information.

7.2. User Interfaces

You should have access to the cylc command line (CLI) and graphical (GUI) user interfaces once cylc has been installed as described in Section Installing Cylc.

7.2.1. Command Line Interface (CLI)

The command line interface is unified under a single top level cylc command that provides access to many sub-commands and their help documentation.

$ cylc help       # Top level command help.
$ cylc run --help # Example command-specific help.

Command help transcripts are printed in Command Reference and are available from the GUI Help menu.

Cylc is scriptable - the error status returned by commands can be relied on.

7.2.2. Graphical User Interface (GUI)

The cylc GUI covers the same functionality as the CLI, but it has more sophisticated suite monitoring capability. It can start and stop suites, or connect to suites that are already running; in either case, shutting down the GUI does not affect the suite itself.

$ gcylc & # or:
$ cylc gui & # Single suite control GUI.
$ cylc gscan & # Multi-suite monitor GUI.

Clicking on a suite in gscan, shown in Fig. 12, opens a gcylc instance for it.

7.3. Suite Configuration

Cylc suites are defined by extended-INI format suite.rc files (the main file format extension is section nesting). These reside in suite configuration directories that may also contain a bin directory and any other suite-related files.

7.4. Suite Registration

Suite registration creates a run directory (under ~/cylc-run/ by default) and populates it with authentication files and a symbolic link to a suite configuration directory. Cylc commands that parse suites can take the file path or the suite name as input. Commands that interact with running suites have to target the suite by name.

# Target a suite by file path:
$ cylc validate /path/to/my/suite/suite.rc
$ cylc graph /path/to/my/suite/suite.rc

# Register a suite:
$ cylc register my.suite /path/to/my/suite/

# Target a suite by name:
$ cylc graph my.suite
$ cylc validate my.suite
$ cylc run my.suite
$ cylc stop my.suite
# etc.

7.5. Suite Passphrases

Registration (above) also generates a suite-specific passphrase file under .service/ in the suite run directory. It is loaded by the suite server program at start-up and used to authenticate connections from client programs.

Possession of a suite’s passphrase file gives full control over it. Without it, the information available to a client is determined by the suite’s public access privilege level.

For more on connection authentication, suite passphrases, and public access, see Client-Server Interaction.

7.6. Import The Example Suites

Run the following command to copy cylc’s example suites and register them for your own use:

$ cylc import-examples /tmp

7.7. Rename The Imported Tutorial Suites

Suites can be renamed by simply renaming (i.e. moving) their run directories. Make the tutorial suite names shorter, and print their locations with cylc print:

$ mv ~/cylc-run/examples/$(cylc --version)/tutorial ~/cylc-run/tut
$ cylc print -ya tut
tut/oneoff/jinja2  | /tmp/cylc-examples/7.0.0/tutorial/oneoff/jinja2
tut/cycling/two    | /tmp/cylc-examples/7.0.0/tutorial/cycling/two
tut/cycling/three  | /tmp/cylc-examples/7.0.0/tutorial/cycling/three
# ...

See cylc print --help for other display options.

7.8. Suite Validation

Suite configurations can be validated to detect syntax (and other) errors:

# pass:
$ cylc validate tut/oneoff/basic
Valid for cylc-6.0.0
$ echo $?
0
# fail:
$ cylc validate my/bad/suite
Illegal item: [scheduling]special tusks
$ echo $?
1

7.9. Hello World in Cylc

suite: tut/oneoff/basic

Here’s the traditional Hello World program rendered as a cylc suite:

[meta]
    title = "The cylc Hello World! suite"
[scheduling]
    [[dependencies]]
        graph = "hello"
[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"

Cylc suites feature a clean separation of scheduling configuration, which determines when tasks are ready to run, and runtime configuration, which determines what to run (and where and how to run it) when a task is ready. In this example the [scheduling] section defines a single task called hello that triggers immediately when the suite starts up. When the task finishes the suite shuts down. That this is a dependency graph will be more obvious when more tasks are added. Under the [runtime] section the script item defines a simple inlined implementation for hello: it sleeps for ten seconds, then prints Hello World!, and exits. This ends up in a job script generated by cylc to encapsulate the task (below) and, thanks to some defaults designed to allow quick prototyping of new suites, it is submitted to run as a background job on the suite host. In fact cylc even provides a default task implementation that makes the entire [runtime] section technically optional:

[meta]
    title = "The minimal complete runnable cylc suite"
[scheduling]
    [[dependencies]]
        graph = "foo"
# (actually, 'title' is optional too ... and so is this comment)

(the resulting dummy task just prints out some identifying information and exits).

7.10. Editing Suites

The text editor invoked by Cylc on suite configurations is determined by cylc site and user global config files, as shown above in User Config File. Check that you have renamed the tutorial example suites as described just above and open the Hello World suite in your text editor:

$ cylc edit tut/oneoff/basic # in-terminal
$ cylc edit -g tut/oneoff/basic & # or GUI

Alternatively, start gcylc on the suite:

$ gcylc tut/oneoff/basic &

and choose Suite -> Edit from the menu.

The editor will be invoked from within the suite configuration directory for easy access to other suite files (in this case there are none). There are syntax highlighting control files for several text editors under <cylc-dir>/etc/syntax/; see in-file comments for installation instructions.

7.11. Running Suites

7.11.1. CLI

Run tut/oneoff/basic using the cylc run command. As a suite runs, detailed timestamped information is written to a suite log and progress can be followed with cylc's suite monitoring tools (below). By default a suite server program daemonizes after printing a short message so that you can exit the terminal or even log out without killing the suite:

$ cylc run tut/oneoff/basic
            ._.
            | |                 The Cylc Suite Engine [7.0.0]
._____._. ._| |_____.           Copyright (C) NIWA & British Crown (Met Office) & Contributors.
| .___| | | | | .___|  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
| !___| !_! | | !___.  This program comes with ABSOLUTELY NO WARRANTY;
!_____!___. |_!_____!  see `cylc warranty`.  It is free software, you
      .___! |           are welcome to redistribute it under certain
      !_____!                conditions; see `cylc conditions`.

*** listening on https://nwp-1:43027/ ***

To view suite server program contact information:
 $ cylc get-suite-contact tut/oneoff/basic

Other ways to see if the suite is still running:
 $ cylc scan -n '\btut/oneoff/basic\b' nwp-1
 $ cylc ping -v --host=nwp-1 tut/oneoff/basic
 $ ps h -opid,args 123456  # on nwp-1

If you’re quick enough (this example only takes 10-15 seconds to run) the cylc scan command will detect the running suite:

$ cylc scan
tut/oneoff/basic oliverh@nwp-1:43027

Note

You can use the --no-detach and --debug options to cylc-run to prevent the suite from daemonizing (i.e. to make it stay attached to your terminal until it exits).

When a task is ready cylc generates a job script to run it, by default as a background job on the suite host. The job process ID is captured, and job output is directed to log files in standard locations under the suite run directory.

Log file locations relative to the suite run directory look like job/1/hello/01/, where the first 1 is the cycle point of the task hello (for non-cycling tasks this is just 1) and the final 01 is the submit number (so that job logs do not get overwritten if a job is resubmitted for any reason).

The suite shuts down automatically once all tasks have succeeded.

7.11.2. GUI

The cylc GUI can start and stop suites, or (re)connect to suites that are already running:

$ cylc gui tut/oneoff/basic &

Use the tool bar Play button, or the Control -> Run menu item, to run the suite again. You may want to alter the suite configuration slightly to make the task take longer to run. Try right-clicking on the hello task to view its output logs. The relative merits of the three suite views - dot, text, and graph - will be more apparent later when we have more tasks. Closing the GUI does not affect the suite itself.

7.12. Remote Suites

Suites can run on localhost or on a remote host.

To start up a suite on a given host, specify it explicitly via the --host= option to a run or restart command.

Otherwise, Cylc selects the best host to start up on from allowed run hosts as specified in the global config under [suite servers], which defaults to localhost. Should there be more than one allowed host set, the most suitable is determined according to the settings specified under [[run host select]], namely exclusion of hosts not meeting suitability thresholds, if provided, then ranking according to the given rank method.
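For example, a site might configure something like this in its global config (host names, rank method, and threshold values are illustrative; see Global (Site, User) Config File Reference for the exact items):

# site global.rc
[suite servers]
    run hosts = host1, host2, host3
    [[run host select]]
        rank = memory             # rank candidate hosts by available memory
        thresholds = memory 2048  # exclude hosts below this memory threshold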

7.13. Discovering Running Suites

Suites that are currently running can be detected with command line or GUI tools:

# list currently running suites and their port numbers:
$ cylc scan
tut/oneoff/basic oliverh@nwp-1:43001

# GUI summary view of running suites:
$ cylc gscan &

The scan GUI is shown in Fig. 12; clicking on a suite in it opens gcylc.

7.14. Task Identifiers

At run time, task instances are identified by their name (see Task and Namespace Names), which is determined entirely by the suite configuration, and a cycle point which is usually a date-time or an integer:

foo.20100808T00Z   # a task with a date-time cycle point
bar.1              # a task with an integer cycle point (could be non-cycling)

Non-cycling tasks usually just have the cycle point 1, but this still has to be used to target the task instance with cylc commands.
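These identifiers are used to target task instances with cylc commands, e.g. (suite and task names are illustrative):

$ cylc trigger my.suite foo.20100808T00Z   # re-trigger a date-time cycling task
$ cylc poll my.suite bar.1                 # poll a task at integer cycle point 1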

7.15. Job Submission: How Tasks Are Executed

suite: tut/oneoff/jobsub

Task job scripts are generated by cylc to wrap the task implementation specified in the suite configuration (environment, script, etc.) in error trapping code, messaging calls to report task progress back to the suite server program, and so forth. Job scripts are written to the suite job log directory where they can be viewed alongside the job output logs. They can be accessed at run time by right-clicking on the task in the cylc GUI, or printed to the terminal:

$ cylc cat-log tut/oneoff/basic hello.1

This command can also print the suite log (and stdout and stderr for suites in daemon mode) and task stdout and stderr logs (see cylc cat-log --help).

A new job script can also be generated on the fly for inspection:

$ cylc jobscript tut/oneoff/basic hello.1

Take a look at the job script generated for hello.1 during the suite run above. The custom scripting should be clearly visible toward the bottom of the file.

The hello task in the first tutorial suite defaults to running as a background job on the suite host. To submit it to the Unix at scheduler instead, configure its job submission settings as in tut/oneoff/jobsub:

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
        [[[job]]]
            batch system = at

Run the suite again after checking that at is running on your system.

Cylc supports a number of different batch systems. Tasks submitted to external batch queuing systems like at, PBS, SLURM, Moab, or LoadLeveler, are displayed as submitted in the cylc GUI until they start executing.
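For example, to submit hello to PBS instead, something like the following could be used (the queue name and directives are illustrative and site-specific):

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
        [[[job]]]
            batch system = pbs
        [[[directives]]]
            -q = normal
            -l walltime = 00:10:00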

7.16. Locating Suite And Task Output

If the --no-detach option is not used, suite stdout and stderr will be directed to the suite run directory along with the time-stamped suite log file, and task job scripts and job logs (task stdout and stderr). The default suite run directory location is $HOME/cylc-run:

$ tree $HOME/cylc-run/tut/oneoff/basic/
|-- .service              # location of run time service files
|    |-- contact          # detail on how to contact the running suite
|    |-- db               # private suite run database
|    |-- passphrase       # passphrase for client authentication
|    |-- source           # symbolic link to source directory
|    |-- ssl.cert         # SSL certificate for the suite server
|    `-- ssl.pem          # SSL private key
|-- cylc-suite.db         # back compat symlink to public suite run database
|-- share                 # suite share directory (not used in this example)
|-- work                  # task work space (sub-dirs are deleted if not used)
|    `-- 1                   # task cycle point directory (or 1)
|        `-- hello              # task work directory (deleted if not used)
|-- log                   # suite log directory
|   |-- db                   # public suite run database
|   |-- job                  # task job log directory
|   |   `-- 1                   # task cycle point directory (or 1)
|   |       `-- hello              # task name
|   |           |-- 01                # task submission number
|   |           |   |-- job              # task job script
|   |           |   |-- job-activity.log # task job activity log
|   |           |   |-- job.err          # task stderr log
|   |           |   |-- job.out          # task stdout log
|   |           |   `-- job.status       # task status file
|   |           `-- NN -> 01          # symlink to latest submission number
|   `-- suite                # suite server log directory
|       |-- err                 # suite server stderr log (daemon mode only)
|       |-- out                 # suite server stdout log (daemon mode only)
|       `-- log                 # suite server event log (timestamped info)

The suite run database files, suite environment file, and task status files are used internally by cylc. Tasks execute in private work/ directories that are deleted automatically if empty when the task finishes. The suite share/ directory is made available to all tasks (by $CYLC_SUITE_SHARE_DIR) as a common share space. The task submission number increments from 1 if a task retries; this is used as a sub-directory of the log tree to avoid overwriting log files from earlier job submissions.

The top level run directory location can be changed in site and user config files if necessary, and the suite share and work locations can be configured separately because of the potentially larger disk space requirement.

Task job logs can be viewed by right-clicking on tasks in the gcylc GUI (so long as the task proxy is live in the suite), manually accessed from the log directory (of course), or printed to the terminal with the cylc cat-log command:

# suite logs:
$ cylc cat-log    tut/oneoff/basic           # suite event log
$ cylc cat-log -o tut/oneoff/basic           # suite stdout log
$ cylc cat-log -e tut/oneoff/basic           # suite stderr log
# task logs:
$ cylc cat-log    tut/oneoff/basic hello.1   # task job script
$ cylc cat-log -o tut/oneoff/basic hello.1   # task stdout log
$ cylc cat-log -e tut/oneoff/basic hello.1   # task stderr log

7.17. Viewing Suite Logs in a Web Browser: Cylc Review

The Cylc Review web service displays suite job logs and other information in web pages, as shown in Fig. 14. It can run under a WSGI server (e.g. Apache with mod_wsgi) as a service for all users, or as an ad hoc service under your own user account.

If a central Cylc Review service has been set up at your site (e.g. as described in Configuring Cylc Review Under Apache) the URL will typically be something like http://<server>/cylc-review/.

_images/cylc-review-screenshot.png

Fig. 14 Screenshot of a Cylc Review web page

Otherwise, to start an ad hoc Cylc Review service to view your own suite logs (or those of others, if you have read access to them), run:

setsid cylc review start 0</dev/null 1>/dev/null 2>&1 &

The service should start at http://<server>:8080 (the port number can optionally be set on the command line). Service logs are written to ~/.cylc/cylc-review*. Run cylc review to view status information, and cylc review stop to stop the service.

7.18. Remote Tasks

suite: tut/oneoff/remote

The hello task in the first two tutorial suites defaults to running on the suite host (see Remote Suites). To make it run on a different host instead, change its runtime configuration as in tut/oneoff/remote:

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
        [[[remote]]]
            host = server1.niwa.co.nz

In general, a task remote is a user account, other than the account running the suite server program, where a task job is submitted to run. It can be on the same machine running the suite or on another machine.

A task remote account must satisfy several requirements:

  • Non-interactive ssh must be enabled from the account running the suite server program to the account for submitting (and managing) the remote task job.
  • Network settings must allow communication back from the remote task job to the suite, either by network ports or ssh, unless the last-resort one way task polling communication method is used.
  • Cylc must be installed and runnable on the task remote account. Other software dependencies like graphviz are not required there.
  • Any files needed by a remote task must be installed on the task host. In this example there is nothing to install because the implementation of hello is inlined in the suite configuration and thus ends up entirely contained within the task job script.

If your username is different on the task host, you can add a User setting for the relevant host in your ~/.ssh/config. If you are unable to do so, the [[[remote]]] section also supports an owner=username item.
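For example (host and user names are illustrative):

# ~/.ssh/config on the account running the suite
Host server1.niwa.co.nz
    User my_remote_username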

If you configure a task account according to the requirements, cylc will invoke itself on the remote account (with a login shell by default) to create log directories, transfer any essential service files, send the task job script over, and submit it to run there by the configured batch system.

Remote task job logs are saved to the suite run directory on the task remote, not on the account running the suite. They can be retrieved by right-clicking on the task in the GUI; or, to have cylc pull them back to the suite account automatically, configure this:

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
        [[[remote]]]
            host = server1.niwa.co.nz
            retrieve job logs = True

This suite will attempt to rsync job logs from the remote host each time a task job completes.

Some batch systems have considerable delays between the time when the job completes and when it writes the job logs in its normal location. If this is the case, you can configure an initial delay and retry delays for job log retrieval. E.g.:

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
        [[[remote]]]
            host = server1.niwa.co.nz
            retrieve job logs = True
            # Retry after 10 seconds, 1 minute and 3 minutes
            retrieve job logs retry delays = PT10S, PT1M, PT3M

Finally, if the disk space of the suite host is limited, you may want to set [[[remote]]]retrieve job logs max size=SIZE. The value of SIZE can be anything that is accepted by the --max-size=SIZE option of the rsync command. E.g.:

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
        [[[remote]]]
            host = server1.niwa.co.nz
            retrieve job logs = True
            # Don't get anything bigger than 10MB
            retrieve job logs max size = 10M

It is worth noting that cylc uses the existence of a job’s job.out or job.err in the local file system to indicate a successful job log retrieval. If retrieve job logs max size=SIZE is set and both job.out and job.err are bigger than SIZE then cylc will consider the retrieval as failed. If retry delays are specified, this will trigger some useless (but harmless) retries. If this occurs regularly, you should try the following:

  • Reduce the verbosity of STDOUT or STDERR from the task.
  • Redirect the verbosity from STDOUT or STDERR to an alternate log file.
  • Adjust the size limit with tolerance to the expected size of STDOUT or STDERR.

For more on remote tasks, see Remote Task Hosting. For more on task communications, see Tracking Task State. For more on suite passphrases and authentication, see Suite Passphrases and Client-Server Interaction.

7.19. Task Triggering

suite: tut/oneoff/goodbye

To make a second task called goodbye trigger after hello finishes successfully, return to the original example, tut/oneoff/basic, and change the suite graph as in tut/oneoff/goodbye:

[scheduling]
    [[dependencies]]
        graph = "hello => goodbye"

or to trigger it at the same time as hello,

[scheduling]
    [[dependencies]]
        graph = "hello & goodbye"

and configure the new task’s behaviour under [runtime]:

[runtime]
    [[goodbye]]
        script = "sleep 10; echo Goodbye World!"

Run tut/oneoff/goodbye and check the output from the new task:

$ cat ~/cylc-run/tut/oneoff/goodbye/log/job/1/goodbye/01/job.out
  # or
$ cylc cat-log -o tut/oneoff/goodbye goodbye.1
JOB SCRIPT STARTING
cylc (scheduler - 2014-08-14T15:09:30+12): goodbye.1 started at 2014-08-14T15:09:30+12
cylc Suite and Task Identity:
  Suite Name  : tut/oneoff/goodbye
  Suite Host  : oliverh-34403dl.niwa.local
  Suite Port  : 43001
  Suite Owner : oliverh
  Task ID     : goodbye.1
  Task Host   : nwp-1
  Task Owner  : oliverh
  Task Try No.: 1

Goodbye World!
cylc (scheduler - 2014-08-14T15:09:40+12): goodbye.1 succeeded at 2014-08-14T15:09:40+12
JOB SCRIPT EXITING (TASK SUCCEEDED)

7.19.1. Task Failure And Suicide Triggering

suite: tut/oneoff/suicide

Task names in the graph string can be qualified with a state indicator to trigger off task states other than success:

    graph = """
a => b        # trigger b if a succeeds
c:submit => d # trigger d if c submits
e:finish => f # trigger f if e succeeds or fails
g:start  => h # trigger h if g starts executing
i:fail   => j # trigger j if i fails
            """

A common use of this is to automate recovery from known modes of failure:

graph = "goodbye:fail => really_goodbye"

i.e. if task goodbye fails, trigger another task that (presumably) really says goodbye.

Failure triggering generally requires use of suicide triggers as well, to remove the recovery task if it isn’t required (otherwise it would hang about indefinitely in the waiting state):

[scheduling]
    [[dependencies]]
        graph = """hello => goodbye
            goodbye:fail => really_goodbye
         goodbye => !really_goodbye # suicide"""

This means if goodbye fails, trigger really_goodbye; and otherwise, if goodbye succeeds, remove really_goodbye from the suite.

Try running tut/oneoff/suicide, which also configures the hello task’s runtime to make it fail, to see how this works.

7.20. Runtime Inheritance

suite: tut/oneoff/inherit

The [runtime] section is actually a multiple inheritance hierarchy. Each subsection is a namespace that represents a task, or if it is inherited by other namespaces, a family. This allows common configuration to be factored out of related tasks very efficiently.

[meta]
    title = "Simple runtime inheritance example"
[scheduling]
    [[dependencies]]
        graph = "hello => goodbye"
[runtime]
    [[root]]
        script = "sleep 10; echo $GREETING World!"
    [[hello]]
        [[[environment]]]
            GREETING = Hello
    [[goodbye]]
        [[[environment]]]
            GREETING = Goodbye

The [root] namespace provides defaults for all tasks in the suite. Here both tasks inherit script from root, which they customize with different values of the environment variable $GREETING.

Note

Inheritance from root is implicit; from other parents an explicit inherit = PARENT is required, as shown below.

7.21. Triggering Families

suite: tut/oneoff/ftrigger1

Task families defined by runtime inheritance can also be used as shorthand in graph trigger expressions. To see this, consider two “greeter” tasks that trigger off another task foo:

[scheduling]
    [[dependencies]]
        graph = "foo => greeter_1 & greeter_2"

If we put the common greeting functionality of greeter_1 and greeter_2 into a special GREETERS family, the graph can be expressed more efficiently like this:

[scheduling]
    [[dependencies]]
        graph = "foo => GREETERS"

i.e. if foo succeeds, trigger all members of GREETERS at once. Here’s the full suite with runtime hierarchy shown:

[meta]
    title = "Triggering a family of tasks"
[scheduling]
    [[dependencies]]
        graph = "foo => GREETERS"
[runtime]
    [[root]]
        pre-script = "sleep 10"
    [[foo]]
        # empty (creates a dummy task)
    [[GREETERS]]
        script = "echo $GREETING World!"
    [[greeter_1]]
        inherit = GREETERS
        [[[environment]]]
            GREETING = Hello
    [[greeter_2]]
        inherit = GREETERS
        [[[environment]]]
            GREETING = Goodbye

Note

We recommend giving ALL-CAPS names to task families to help distinguish them from task names. However, this is just a convention.

Experiment with the tut/oneoff/ftrigger1 suite to see how this works.

7.22. Triggering Off Of Families

suite: tut/oneoff/ftrigger2

Tasks (or families) can also trigger off other families, but in this case we need to specify what the trigger means in terms of the upstream family members. Here’s how to trigger another task bar if all members of GREETERS succeed:

[scheduling]
    [[dependencies]]
        graph = """foo => GREETERS
            GREETERS:succeed-all => bar"""

Verbose validation in this case reports:

$ cylc val -v tut/oneoff/ftrigger2
...
Graph line substitutions occurred:
  IN: GREETERS:succeed-all => bar
  OUT: greeter_1:succeed & greeter_2:succeed => bar
...

Cylc ignores family member qualifiers like succeed-all on the right side of a trigger arrow, where they don’t make sense, to allow the two graph lines above to be combined in simple cases:

[scheduling]
    [[dependencies]]
        graph = "foo => GREETERS:succeed-all => bar"

Any task triggering status qualified by -all or -any, for the members, can be used with a family trigger. For example, here’s how to trigger bar if all members of GREETERS finish (succeed or fail) and any of them succeed:

[scheduling]
    [[dependencies]]
        graph = """foo => GREETERS
    GREETERS:finish-all & GREETERS:succeed-any => bar"""

(use of GREETERS:succeed-any by itself here would trigger bar as soon as any one member of GREETERS completed successfully). Verbose validation now begins to show how family triggers can simplify complex graphs, even for this tiny two-member family:

$ cylc val -v tut/oneoff/ftrigger2
...
Graph line substitutions occurred:
  IN: GREETERS:finish-all & GREETERS:succeed-any => bar
  OUT: ( greeter_1:succeed | greeter_1:fail ) & \
       ( greeter_2:succeed | greeter_2:fail ) & \
       ( greeter_1:succeed | greeter_2:succeed ) => bar
...

Experiment with tut/oneoff/ftrigger2 to see how this works.

7.23. Suite Visualization

You can style dependency graphs with an optional [visualization] section, as shown in tut/oneoff/ftrigger2:

[visualization]
    default node attributes = "style=filled"
    [[node attributes]]
        foo = "fillcolor=#6789ab", "color=magenta"
        GREETERS = "fillcolor=#ba9876"
        bar = "fillcolor=#89ab67"

To display the graph in an interactive viewer:

$ cylc graph tut/oneoff/ftrigger2 &    # dependency graph
$ cylc graph -n tut/oneoff/ftrigger2 & # runtime inheritance graph

It should look like Fig. 16 (with the GREETERS family node expanded on the right).

_images/tut-hello-multi-1.png
_images/tut-hello-multi-2.png
_images/tut-hello-multi-3.png

Fig. 16 The tut/oneoff/ftrigger2 dependency and runtime inheritance graphs

Graph styling can be applied to entire families at once, and custom “node groups” can also be defined for non-family groups.

7.24. External Task Scripts

suite: tut/oneoff/external

The tasks in our examples so far have all had their implementation inlined in the suite configuration, but real tasks often need to call external commands, scripts, or executables. To try this, let’s return to the basic Hello World suite and cut the implementation of the task hello out to a file hello.sh in the suite bin directory:

#!/bin/sh

set -e

GREETING=${GREETING:-Goodbye}
echo "$GREETING World! from $0"

Make the task script executable, and change the hello task runtime section to invoke it:

[meta]
    title = "Hello World! from an external task script"
[scheduling]
    [[dependencies]]
        graph = "hello"
[runtime]
    [[hello]]
        pre-script = sleep 10
        script = hello.sh
        [[[environment]]]
            GREETING = Hello

If you run the suite now the new greeting from the external task script should appear in the hello task stdout log. This works because cylc automatically adds the suite bin directory to $PATH in the environment passed to tasks via their job scripts. To execute scripts (etc.) located elsewhere you can refer to the file by its full file path, or set $PATH appropriately yourself (this could be done via $HOME/.profile, which is sourced at the top of the task job script, or in the suite configuration itself).

Note

The use of set -e above makes the script abort on error. This allows the error trapping code in the task job script to detect unforeseen errors automatically.

7.25. Cycling Tasks

suite: tut/cycling/one

So far we’ve considered non-cycling tasks, which finish without spawning a successor.

Cycling is based on iterating through date-time or integer sequences. For example, a sequence might be a set of date-times every 6 hours starting from a particular date-time; a cycling task may run at each date-time item (cycle point) in that sequence.

There may be multiple instances of this type of task running in parallel, if the opportunity arises and their dependencies allow it. Alternatively, a sequence can be defined with only one valid cycle point - in that case, a task belonging to that sequence may only run once.

Open the tut/cycling/one suite:

[meta]
    title = "Two cycling tasks, no inter-cycle dependence"
[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20130808T00
    final cycle point = 20130812T00
    [[dependencies]]
        [[[T00,T12]]] # 00 and 12 hours UTC every day
            graph = "foo => bar"
[visualization]
    initial cycle point = 20130808T00
    final cycle point = 20130809T00
    [[node attributes]]
        foo = "color=red"
        bar = "color=blue"

The difference between cycling and non-cycling suites is all in the [scheduling] section, so we will leave the [runtime] section alone for now (this will result in cycling dummy tasks).

Note

The graph is now defined under a new section heading that makes each task under it have a succession of cycle points ending in 00 or 12 hours, between specified initial and final cycle points (or indefinitely if no final cycle point is given), as shown in Fig. 17.

_images/tut-one.png

Fig. 17 The tut/cycling/one suite

If you run this suite, instances of foo will spawn in parallel out to the runahead limit, and each bar will trigger off the corresponding instance of foo at the same cycle point. The runahead limit, which defaults to a few cycles but is configurable, prevents uncontrolled spawning of cycling tasks in suites that are not constrained by clock triggers in real time operation.
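
For example, a minimal sketch of adjusting the limit (these are Cylc 7 [scheduling] settings; use one form or the other):

[scheduling]
    max active cycle points = 5  # allow up to 5 active cycle points
    # or, as a date-time interval:
    # runahead limit = PT36H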

Experiment with tut/cycling/one to see how cycling tasks work.

7.25.1. ISO 8601 Date-Time Syntax

The suite above is a very simple example of a cycling date-time workflow. More generally, cylc comprehensively supports the ISO 8601 standard for date-time instants, intervals, and sequences. Cycling graph sections can be specified using full ISO 8601 recurrence expressions, but these may be simplified by assuming context information from the suite - namely initial and final cycle points. One form of the recurrence syntax looks like Rn/start-date-time/period (Rn means run n times). In the example above, if the initial cycle point is always at 00 or 12 hours then [[[T00,T12]]] could be written as [[[PT12H]]], which is short for [[[R/initial-cycle-point/PT12H/]]] - i.e. run every 12 hours indefinitely starting at the initial cycle point. It is possible to add constraints to the suite to only allow initial cycle points at 00 or 12 hours e.g.

[scheduling]
    initial cycle point = 20130808T00
    initial cycle point constraints = T00, T12
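
To recap the equivalence described above (valid when the initial cycle point is at 00 or 12 hours):

[[[T00,T12]]]  # explicit daily hours
[[[PT12H]]]    # every 12 hours from the initial cycle point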

7.25.2. Inter-Cycle Triggers

suite: tut/cycling/two

The tut/cycling/two suite adds inter-cycle dependence to the previous example:

[scheduling]
    [[dependencies]]
        # Repeat with cycle points of 00 and 12 hours every day:
        [[[T00,T12]]]
            graph = "foo[-PT12H] => foo => bar"

For any given cycle point in the sequence defined by the cycling graph section heading, bar triggers off foo as before, but now foo triggers off its own previous instance foo[-PT12H]. Date-time offsets in inter-cycle triggers are expressed as ISO 8601 intervals (12 hours in this case). Fig. 18 shows how this connects the cycling graph sections together.

_images/tut-two.png

Fig. 18 The tut/cycling/two suite

Experiment with this suite to see how inter-cycle triggers work.

Note

The first instance of foo, at suite start-up, will trigger immediately in spite of its inter-cycle trigger, because cylc ignores dependence on points earlier than the initial cycle point. However, the presence of an inter-cycle trigger usually implies something special has to happen at start-up. If a model depends on its own previous instance for restart files, for example, then some special process has to generate the initial set of restart files when there is no previous cycle point to do it. The following section shows one way to handle this in cylc suites.

7.25.3. Initial Non-Repeating (R1) Tasks

suite: tut/cycling/three

Sometimes we want to be able to run a task at the initial cycle point, but refrain from running it in subsequent cycles. We can do this by writing an extra set of dependencies that are only valid at a single date-time cycle point. If we choose this to be the initial cycle point, these will only apply at the very start of the suite.

The cylc syntax for writing this single date-time cycle point occurrence is R1, which stands for R1/no-specified-date-time/no-specified-period. This is an adaptation of part of the ISO 8601 date-time standard’s recurrence syntax (Rn/date-time/period) with some special context information supplied by cylc for the no-specified-* data.

The 1 in the R1 means run once. As we’ve specified no date-time, Cylc will use the initial cycle point date-time by default, which is what we want. We’ve also omitted the period; cylc sets this to a zero duration in this case (as the sequence never repeats, the period is not significant).

For example, in tut/cycling/three:

[cylc]
    cycle point time zone = +13
[scheduling]
    initial cycle point = 20130808T00
    final cycle point = 20130812T00
    [[dependencies]]
        [[[R1]]]
            graph = "prep => foo"
        [[[T00,T12]]]
            graph = "foo[-PT12H] => foo => bar"

This is shown in Fig. 19.

Note

The time zone has been set to +1300 in this case, instead of UTC (Z) as before. If no time zone or UTC mode were set, the local time zone of your machine would be used in the cycle points.

At the initial cycle point, foo will depend on foo[-PT12H] and also on prep:

prep.20130808T0000+13 & foo.20130807T1200+13 => foo.20130808T0000+13

Thereafter, it will just look like e.g.:

foo.20130808T0000+13 => foo.20130808T1200+13

However, in our initial cycle point example, the dependence on foo.20130807T1200+13 will be ignored, because that task’s cycle point is earlier than the suite’s initial cycle point and so it cannot run. This means that the initial cycle point dependencies for foo actually look like:

prep.20130808T0000+13 => foo.20130808T0000+13
_images/tut-three.png

Fig. 19 The tut/cycling/three suite

  • R1 tasks can also be used to make something special happen at suite shutdown, or at any single cycle point throughout the suite run. For a full primer on cycling syntax, see Advanced Examples.

7.25.4. Integer Cycling

suite: tut/cycling/integer

Cylc can also do integer cycling for repeating workflows that are not date-time based.

Open the tut/cycling/integer suite, which is plotted in Fig. 20.

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    final cycle point = 3
    [[dependencies]]
        [[[R1]]] # = R1/1/?
            graph = start => foo
        [[[P1]]] # = R/1/P1
            graph = foo[-P1] => foo => bar
        [[[R2/P1]]] # = R2/P1/3
            graph = bar => stop

[visualization]
    [[node attributes]]
        start = "style=filled", "fillcolor=skyblue"
        foo = "style=filled", "fillcolor=slategray"
        bar = "style=filled", "fillcolor=seagreen3"
        stop = "style=filled", "fillcolor=orangered"
_images/tut-cyc-int.png

Fig. 20 The tut/cycling/integer suite

The integer cycling notation is intended to look similar to the ISO 8601 date-time notation, but it is simpler for obvious reasons. The example suite illustrates two recurrence forms, Rn/start-point/period and Rn/period/stop-point, simplified somewhat using suite context information (namely the initial and final cycle points). The first form is used to run one special task called start at start-up, and for the main cycling body of the suite; and the second form to run another special task called stop in the final two cycles. The P character denotes period (interval) just like in the date-time notation. R/1/P2 would generate the sequence of points 1,3,5,....
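
For example, that last sequence written as a graph section heading:

[[[ R/1/P2 ]]]  # repeat with step 2 from point 1: 1, 3, 5, ...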

  • For more on integer cycling, including a more realistic usage example see Integer Cycling.

7.26. Jinja2

suite: tut/oneoff/jinja2

Cylc has built-in support for the Jinja2 template processor, which allows us to embed code in suite configurations to generate the final result seen by cylc.

The tut/oneoff/jinja2 suite illustrates two common uses of Jinja2: changing suite content or structure based on the value of a logical switch; and iteratively generating dependencies and runtime configuration for groups of related tasks:

#!jinja2

{% set MULTI = True %}
{% set N_GOODBYES = 3 %}

[meta]
    title = "A Jinja2 Hello World! suite"
[scheduling]
    [[dependencies]]
{% if MULTI %}
        graph = "hello => BYE"
{% else %}
        graph = "hello"
{% endif %}

[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
{% if MULTI %}
    [[BYE]]
        script = "sleep 10; echo Goodbye World!"
    {% for I in range(0,N_GOODBYES) %}
    [[ goodbye_{{I}} ]]
        inherit = BYE
    {% endfor %}
{% endif %}

To view the result of Jinja2 processing with the Jinja2 flag MULTI set to False:

$ cylc view --jinja2 --stdout tut/oneoff/jinja2
[meta]
    title = "A Jinja2 Hello World! suite"
[scheduling]
    [[dependencies]]
        graph = "hello"
[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"

And with MULTI set to True:

$ cylc view --jinja2 --stdout tut/oneoff/jinja2
[meta]
    title = "A Jinja2 Hello World! suite"
[scheduling]
    [[dependencies]]
        graph = "hello => BYE"
[runtime]
    [[hello]]
        script = "sleep 10; echo Hello World!"
    [[BYE]]
        script = "sleep 10; echo Goodbye World!"
    [[ goodbye_0 ]]
        inherit = BYE
    [[ goodbye_1 ]]
        inherit = BYE
    [[ goodbye_2 ]]
        inherit = BYE

7.27. Task Retry On Failure

suite: tut/oneoff/retry

Tasks can be configured to retry a number of times if they fail. An environment variable $CYLC_TASK_TRY_NUMBER increments from 1 on each successive try, and is passed to the task to allow different behaviour on the retry:

[meta]
    title = "A task with automatic retry on failure"
[scheduling]
    [[dependencies]]
        graph = "hello"
[runtime]
    [[hello]]
        script = """
sleep 10
if (( CYLC_TASK_TRY_NUMBER < 3 )); then
    echo "Hello ... aborting!"
    exit 1
else
    echo "Hello World!"
fi"""
        [[[job]]]
            execution retry delays = 2*PT6S # retry twice after 6-second delays

If a task with configured retries fails, it goes into the retrying state until the next retry delay is up, then it resubmits. It only enters the failed state on a final definitive failure.

If a task with configured retries is killed (by cylc kill or via the GUI) it goes to the held state so that the operator can decide whether to release it and continue the retry sequence or to abort the retry sequence by manually resetting it to the failed state.
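
For example, assuming a suite registered as REG with a killed task hello.1 now in the held state (names here are hypothetical), the operator could run one of:

$ cylc release REG hello.1                 # continue the retry sequence
$ cylc reset --state=failed REG hello.1    # abort the retry sequence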

Experiment with tut/oneoff/retry to see how this works.

7.28. Other Users’ Suites

If you have read access to another user’s account (even on another host) it is possible to use cylc monitor to look at their suite’s progress without full shell access to their account. To do this, you will need to copy their suite passphrase to

$HOME/.cylc/SUITE_OWNER@SUITE_HOST/SUITE_NAME/passphrase

(use of the host and owner names is optional here - see Full Control - With Auth Files) and also retrieve the port number of the running suite from:

~SUITE_OWNER/cylc-run/SUITE_NAME/.service/contact

Once you have this information, you can run

$ cylc monitor --user=SUITE_OWNER --port=SUITE_PORT SUITE_NAME

to view the progress of their suite.

Other suite-connecting commands work in the same way; see Remote Control.

7.29. Other Things To Try

Almost every feature of cylc can be tested quickly and easily with a simple dummy suite. You can write your own, or start from one of the example suites in /path/to/cylc/examples (see use of cylc import-examples above) - they all run “out of the box” and can be copied and modified at will.

  • Change the suite runahead limit in a cycling suite.
  • Stop a suite mid-run with cylc stop, and restart it again with cylc restart.
  • Hold (pause) a suite mid-run with cylc hold, then modify the suite configuration and cylc reload it before using cylc release to continue (you can also reload without holding).
  • Use the gcylc View menu to show the task state color key and watch tasks in the task-states example evolve as the suite runs.
  • Manually re-run a task that has already completed or failed, with cylc trigger.
  • Use an internal queue to prevent more than an allotted number of tasks from running at once even though they are ready - see Limiting Activity With Internal Queues.
  • Configure task event hooks to send an email, or shut the suite down, on task failure.

8. Suite Name Registration

8.1. Suite Registration

Cylc commands target suites via their names, which are relative path names under the suite run directory (~/cylc-run/ by default). Suites can be grouped together under sub-directories. E.g.:

$ cylc print -t nwp
nwp
 |-oper
 | |-region1  Local Model Region1       /home/oliverh/cylc-run/nwp/oper/region1
 | `-region2  Local Model Region2       /home/oliverh/cylc-run/nwp/oper/region2
 `-test
   `-region1  Local Model TEST Region1  /home/oliverh/cylc-run/nwp/test/region1

Suite names can be pre-registered with the cylc register command, which creates the suite run directory structure and some service files underneath it. Otherwise, cylc run will do this at suite start-up.
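
For example, to pre-register a suite under a hierarchical name (the path here is hypothetical):

$ cylc register nwp/test/region2 /home/oliverh/suites/test/region2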

8.2. Suite Names

Suite names are not validated by cylc. A suite name can be anything that is a valid filename within your operating system’s file system, which includes restrictions on name length (as described under Task and Namespace Names), with the following exceptions:

  • /, which is not supported for general filenames on e.g. Linux systems but is allowed for suite names to generate hierarchical suites (see register);
  • while possible in filenames on many systems, it is strongly advised that suite names do not contain any whitespace characters (e.g. as in my suite).

9. Suite Configuration

Cylc suites are defined in structured, validated, suite.rc files that concisely specify the properties of, and the relationships between, the various tasks managed by the suite. This section of the User Guide deals with the format and content of the suite.rc file, including task definition. Task implementation - what’s required of the real commands, scripts, or programs that do the processing that the tasks represent - is covered in Task Implementation; and task job submission - how tasks are submitted to run - is in Task Job Submission and Management.

9.1. Suite Configuration Directories

A cylc suite configuration directory contains:

  • A suite.rc file: this is the suite configuration.
    • And any include-files used in it (see below; may be kept in sub-directories).
  • A bin/ sub-directory (optional)
    • For scripts and executables that implement, or are used by, suite tasks.
    • Automatically added to $PATH in task execution environments.
    • Alternatively, tasks can call external commands, scripts, or programs; or they can be scripted entirely within the suite.rc file.
  • A lib/python/ sub-directory (optional)
  • Any other sub-directories and files - documentation, control files, etc. (optional)
    • Holding everything in one place makes proper suite revision control possible.
    • Portable access to files here, for running tasks, is provided through $CYLC_SUITE_DEF_PATH (see Task Execution Environment).
    • Ignored by cylc, but the entire suite configuration directory tree is copied when you copy a suite using cylc commands.

A typical example:

/path/to/my/suite   # suite configuration directory
    suite.rc           # THE SUITE CONFIGURATION FILE
    bin/               # scripts and executables used by tasks
        foo.sh
        bar.sh
        ...
    # (OPTIONAL) any other suite-related files, for example:
    inc/               # suite.rc include-files
        nwp-tasks.rc
        globals.rc
        ...
    doc/               # documentation
    control/           # control files
    ancil/             # ancillary files
    ...

9.2. Suite.rc File Overview

Suite.rc files use an extended INI format with section nesting.

Embedded template processor expressions may also be used in the file, to programmatically generate the final suite configuration seen by cylc. Currently the Jinja2 and EmPy template processors are supported; see Jinja2 and EmPy for examples. In the future cylc may provide a plug-in interface to allow use of other template engines too.

9.2.1. Syntax

The following defines legal suite.rc syntax:

  • Items are of the form item = value.
  • [Section] headings are enclosed in square brackets.
  • Sub-section [[nesting]] is defined by repeated square brackets.
    • Sections are closed by the next section heading.
  • Comments (line and trailing) follow a hash character: #
  • List values are comma-separated.
  • Single-line string values can be single-, double-, or un-quoted.
  • Multi-line string values are triple-quoted (using single or double quote characters).
  • Boolean values are capitalized: True, False.
  • Leading and trailing whitespace is ignored.
  • Indentation is optional but should be used for clarity.
  • Continuation lines follow a trailing backslash: \
  • Duplicate sections add their items to those previously defined under the same section.
  • Duplicate items override, except for dependency graph strings, which are additive.
  • Include-files %include inc/foo.rc can be used as a verbatim inlining mechanism.

Suites that embed templating code (see Jinja2 and EmPy) must process to raw suite.rc syntax.
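
The following contrived fragment illustrates several of these rules at once (the section and item names are purely illustrative, not real cylc settings):

[section]                    # a top level section
    item = value             # a trailing comment
    a list = one, two, three
    a flag = True            # booleans are capitalized
    multi-line = """
        first line
        second line"""
    [[sub-section]]          # nested; closed by the next heading
        item = new value     # overrides any earlier duplicate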

9.2.2. Include-Files

Cylc has native support for suite.rc include-files, which may help to organize large suites. Inclusion boundaries are completely arbitrary - you can think of include-files as chunks of the suite.rc file simply cut-and-pasted into another file. Include-files may be included multiple times in the same file, and even nested. Include-file paths can be specified portably relative to the suite configuration directory, e.g.:

# include the file $CYLC_SUITE_DEF_PATH/inc/foo.rc:
%include inc/foo.rc
9.2.2.1. Editing Temporarily Inlined Suites

Cylc’s native file inclusion mechanism supports optional inlined editing:

$ cylc edit --inline SUITE

The suite will be split back into its constituent include-files when you exit the edit session. While editing, the inlined file becomes the official suite configuration so that changes take effect whenever you save the file. See cylc prep edit --help for more information.

9.2.2.2. Include-Files via Jinja2

Jinja2 also has template inclusion functionality of its own; see Jinja2.

9.2.3. Syntax Highlighting For Suite Configuration

Cylc comes with syntax files for a number of text editors:

<cylc-dir>/etc/syntax/cylc.vim     # vim
<cylc-dir>/etc/syntax/cylc-mode.el # emacs
<cylc-dir>/etc/syntax/cylc.lang    # gedit (and other gtksourceview programs)
<cylc-dir>/etc/syntax/cylc.xml     # kate

Refer to comments at the top of each file to see how to use them.

9.2.4. Gross File Structure

Cylc suite.rc files consist of a suite title and description followed by configuration items grouped under several top level section headings:

  • [cylc] - non task-specific suite configuration
  • [scheduling] - determines when tasks are ready to run
    • tasks with special behaviour, e.g. clock-trigger tasks
    • the dependency graph, which defines the relationships between tasks
  • [runtime] - determines how, where, and what to execute when tasks are ready
    • script, environment, job submission, remote hosting, etc.
    • suite-wide defaults in the root namespace
    • a nested family hierarchy with common properties inherited by related tasks
  • [visualization] - suite graph styling

9.2.5. Validation

Cylc suite.rc files are automatically validated against a specification that defines all legal entries, values, options, and defaults. This detects formatting errors, typographic errors, illegal items and illegal values prior to run time. Some values are complex strings that require further parsing by cylc to determine their correctness (this is also done during validation). All legal entries are documented in the Suite.rc Reference.

The validator reports the line numbers of detected errors. Here’s an example showing a section heading with a missing right bracket:

$ cylc validate my.suite
    [[special tasks]
'Section bracket mismatch, line 19'

If the suite.rc file uses include-files, cylc view will show an inlined copy of the suite with correct line numbers (you can also edit suites in a temporarily inlined state with cylc edit --inline).

Validation does not check the validity of chosen batch systems.

9.3. Scheduling - Dependency Graphs

The [scheduling] section of a suite.rc file defines the relationships between tasks in a suite - the information that allows cylc to determine when tasks are ready to run. The most important component of this is the suite dependency graph. Cylc graph notation provides clear and very concise textual graph representations, because sections of the graph that repeat at different hours of the day, say, only have to be defined once. Here’s an example with dependencies that vary depending on the particular cycle point:

[scheduling]
    initial cycle point = 20200401
    final cycle point = 20200405
    [[dependencies]]
        [[[T00,T06,T12,T18]]] # validity (hours)
            graph = """
A => B & C   # B and C trigger off A
A[-PT6H] => A  # Model A restart trigger
                    """
        [[[T06,T18]]] # hours
            graph = "C => X"

Fig. 21 shows the complete suite.rc listing alongside the suite graph. This is a complete, valid, runnable suite (it will use default task runtime properties such as script).

_images/dep-eg-1.png

Example Suite

[meta]
    title = "Dependency Example 1"
[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20200401
    final cycle point = 20200405
    [[dependencies]]
        [[[T00,T06,T12,T18]]] # validity (hours)
            graph = """
A => B & C   # B and C trigger off A
A[-PT6H] => A  # Model A restart trigger
                    """
        [[[T06,T18]]] # hours
            graph = "C => X"
[visualization]
    initial cycle point = 20200401
    final cycle point = 20200401T06
    [[node attributes]]
        X = "color=red"

9.3.1. Graph String Syntax

Multiline graph strings may contain:

  • blank lines
  • arbitrary white space
  • internal comments: following the # character
  • conditional task trigger expressions - see below.

9.3.2. Interpreting Graph Strings

Suite dependency graphs can be broken down into pairs in which the left side (which may be a single task or family, or several that are conditionally related) defines a trigger for the task or family on the right. For instance the “word graph” C triggers off B which triggers off A can be deconstructed into pairs C triggers off B and B triggers off A. In this section we use only the default trigger type, which is to trigger off the upstream task succeeding; see Task Triggering for other available triggers.

In the case of cycling tasks, the triggers defined by a graph string are valid for cycle points matching the list of hours specified for the graph section. For example this graph:

[scheduling]
    [[dependencies]]
        [[[T00,T12]]]
            graph = "A => B"

implies that B triggers off A for cycle points in which the hour matches 00 or 12.

To define inter-cycle dependencies, attach an offset indicator to the left side of a pair:

[scheduling]
    [[dependencies]]
        [[[T00,T12]]]
            graph = "A[-PT12H] => B"

This means B[time] triggers off A[time-PT12H] (12 hours before) for cycle points with hours matching 00 or 12. time is implicit because this keeps graphs clean and concise, given that the majority of tasks will typically depend only on others with the same cycle point. Cycle point offsets can only appear on the left of a pair, because each pair defines a trigger for the task on the right at the current cycle point. However, A => B[-PT6H], which is illegal, can be reformulated as a future trigger A[+PT6H] => B (see Inter-Cycle Triggers). It is also possible to combine multiple offsets within a cycle point offset, e.g.

[scheduling]
    [[dependencies]]
        [[[T00,T12]]]
            graph = "A[-P1D-PT12H] => B"

This means that B[time] triggers off A[time-P1D-PT12H] (1 day and 12 hours before).

Triggers can be chained together. This graph:

graph = """A => B  # B triggers off A
           B => C  # C triggers off B"""

is equivalent to this:

graph = "A => B => C"

Each trigger in the graph must be unique but the same task can appear in multiple pairs or chains. Separately defined triggers for the same task have an AND relationship. So this:

graph = """A => X  # X triggers off A
           B => X  # X also triggers off B"""

is equivalent to this:

graph = "A & B => X"  # X triggers off A AND B

In summary, the branching tree structure of a dependency graph can be partitioned into lines (in the suite.rc graph string) of pairs or chains, in any way you like, with liberal use of internal white space and comments to make the graph structure as clear as possible.

# B triggers if A succeeds, then C and D trigger if B succeeds:
    graph = "A => B => C & D"
# which is equivalent to this:
    graph = """A => B => C
               B => D"""
# and to this:
    graph = """A => B => D
               B => C"""
# and to this:
    graph = """A => B
               B => C
               B => D"""
# and it can even be written like this:
    graph = """A => B # blank line follows:

               B => C # comment ...
               B => D"""
9.3.2.1. Splitting Up Long Graph Lines

It is not necessary to use the general line continuation marker \ to split long graph lines. Just break at dependency arrows, or split long chains into smaller ones. This graph:

graph = "A => B => C"

is equivalent to this:

graph = """A => B =>
           C"""

and also to this:

graph = """A => B
           B => C"""

9.3.3. Graph Types

A suite configuration can contain multiple graph strings that are combined to generate the final graph.

9.3.3.1. One-off (Non-Cycling)

Fig. 22 shows a small suite of one-off non-cycling tasks; these all share a single cycle point (1) and don’t spawn successors (once they’re all finished the suite just exits). The integer 1 attached to each graph node is just an arbitrary label here.

_images/test1.png

One-off (Non-Cycling) Tasks.

[meta]
    title = some one-off tasks
[scheduling]
    [[dependencies]]
        graph = "foo => bar & baz => qux"
9.3.3.2. Cycling Graphs

For cycling tasks the graph section heading defines a sequence of cycle points for which the subsequent graph section is valid. Fig. 23 shows a small suite of cycling tasks.

_images/test2.png

Cycling Tasks.

[meta]
    title = some cycling tasks
# (no dependence between cycle points)
[scheduling]
    [[dependencies]]
        [[[T00,T12]]]
            graph = "foo => bar & baz => qux"

9.3.4. Graph Section Headings

Graph section headings define recurrence expressions; the graph within a graph section heading defines a workflow at each point of the recurrence. For example, in the following scenario:

[scheduling]
    [[dependencies]]
        [[[ T06 ]]]  # A graph section heading
            graph = foo => bar

T06 means “Run every day starting at 06:00 after the initial cycle point”. Cylc allows you to start (or end) at any particular time, repeat at whatever frequency you like, and even optionally limit the number of repetitions.

Graph section headings can also be used with integer cycling; see Integer Cycling.

9.3.4.1. Syntax Rules

Date-time cycling information is made up of a starting date-time, an interval, and an optional limit.

The time is assumed to be in the local time zone unless you set [cylc]cycle point time zone or [cylc]UTC mode. The calendar is assumed to be the proleptic Gregorian calendar unless you set [scheduling]cycling mode.
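
For example, a minimal sketch of these settings (the values are illustrative):

[cylc]
    cycle point time zone = +0530  # or, e.g.: UTC mode = True
[scheduling]
    cycling mode = 360day          # e.g. for a 360-day model calendar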

The syntax for representations is based on the ISO 8601 date-time standard, including its representations of date-times and intervals. What cylc defines for its cycling syntax is an optionally (and often heavily) condensed form of the ISO 8601 recurrence syntax. The most common full form is: R[limit?]/[date-time]/[interval]. However, information that can be guessed from the suite context may be omitted (rules below), so it can also be written as:

R[limit?]/[date-time]
R[limit?]//[interval]
[date-time]/[interval]
R[limit?] # Special case: limit of 1 only (e.g. R1)
[date-time]
[interval]

with example graph headings for each form being:

[[[ R5/T00 ]]]           # Run 5 times at 00:00 every day
[[[ R//PT1H ]]]          # Run every hour (Note the R// is redundant)
[[[ 20000101T00Z/P1D ]]] # Run every day starting at 00:00 1st Jan 2000
[[[ R1 ]]]               # Run once at the initial cycle point
[[[ R1/20000101T00Z ]]]  # Run once at 00:00 1st Jan 2000
[[[ P1Y ]]]              # Run every year

Note

T00 is an example of [date-time], with an inferred 1 day period and no limit.

Where some or all date-time information is omitted, it is inferred to be relative to the initial date-time cycle point. For example, T00 by itself would mean the next occurrence of midnight that follows, or is, the initial cycle point. Entering +PT6H would mean 6 hours after the initial cycle point. Entering -P1D would mean 1 day before the initial cycle point. Entering no information for the date-time implies the initial cycle point date-time itself.

Where the interval is omitted and some (but not all) date-time information is omitted, it is inferred to be a single unit above the largest given specific date-time unit. For example, the largest given specific unit in T00 is hours, so the inferred interval is 1 day (daily), P1D.

Where the limit is omitted, unlimited cycling is assumed. This will be bounded by the final cycle point’s date-time if given.
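
For example (the second heading matches the monthly example under Advanced Examples below):

[[[ T06 ]]]    # largest given unit is hours: inferred interval P1D,
               # i.e. daily at 06:00
[[[ 01T00 ]]]  # largest given unit is day-of-month: inferred interval
               # P1M, i.e. monthly on the 1st at 00:00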

Another supported form of ISO 8601 recurrence is: R[limit?]/[interval]/[date-time]. This form uses the date-time as the end of the cycling sequence rather than the start. For example, R3/P5D/20140430T06 means:

20140420T06
20140425T06
20140430T06

This kind of form can be used for specifying special behaviour near the end of the suite, at the final cycle point’s date-time. We can also represent this in cylc with a collapsed form:

R[limit?]/[interval]
R[limit?]//[date-time]
[interval]/[date-time]

So, for example, you can write:

[[[ R1//+P0D ]]]  # Run once at the final cycle point
[[[ R5/P1D ]]]    # Run 5 times, every 1 day, ending at the final
                  # cycle point
[[[ P2W/T00 ]]]   # Run every 2 weeks ending at 00:00 following
                  # the final cycle point
[[[ R//T00 ]]]    # Run every 1 day ending at 00:00 following the
                  # final cycle point
9.3.4.2. Referencing The Initial And Final Cycle Points

For convenience the caret and dollar symbols may be used as shorthand for the initial and final cycle points. Using this shorthand you can write:

[[[ R1/^+PT12H ]]]  # Repeat once 12 hours after the initial cycle point
                    # R[limit]/[date-time]
                    # Equivalent to [[[ R1/+PT12H ]]]
[[[ R1/$ ]]]        # Repeat once at the final cycle point
                    # R[limit]/[date-time]
                    # Equivalent to [[[ R1//+P0D ]]]
[[[ $-P2D/PT3H ]]]  # Repeat 3 hourly starting two days before
                    # the final cycle point
                    # [date-time]/[interval]

Note

There can be multiple ways to write the same headings, for instance the following all run once at the final cycle point:

[[[ R1/P0Y ]]]      # R[limit]/[interval]
[[[ R1/P0Y/$ ]]]    # R[limit]/[interval]/[date-time]
[[[ R1/$ ]]]        # R[limit]/[date-time]
9.3.4.3. Excluding Dates

Date-times can be excluded from a recurrence with an exclamation mark; for example [[[ P1D!20000101 ]]] means run daily except on the 1st of January 2000.

This syntax can be used to exclude one or multiple date-times from a recurrence. Multiple date-times are excluded using the syntax [[[ P1D!(20000101,20000102,...) ]]]. All date-times listed within the parentheses after the exclamation mark will be excluded.

Note

The ^ and $ symbols (shorthand for the initial and final cycle points) are both date-times so [[[ T12!$-PT1D ]]] is valid.

If a run limit is used in combination with an exclusion, the sequence might generate fewer runs than the limit specifies. For example, in the following suite foo will only run once, as its second run has been excluded.

[scheduling]
    initial cycle point = 20000101T00Z
    final cycle point = 20000105T00Z
    [[dependencies]]
        [[[ R2/P1D!20000102 ]]]
            graph = foo
9.3.4.4. Advanced exclusion syntax

In addition to excluding isolated date-time points or lists of date-time points from recurrences, exclusions themselves may be date-time recurrence sequences. Any partial date-time or sequence given after the exclamation mark will be excluded from the main sequence.

For example, partial date-times can be excluded using the syntax:

[[[ PT1H ! T12 ]]]          # Run hourly but not at 12:00 from the initial
                            # cycle point.
[[[ T-00 ! (T00, T06, T12, T18) ]]]   # Run hourly but not at 00:00, 06:00,
                                      # 12:00, 18:00.
[[[ PT5M ! T-15 ]]]         # Run 5-minutely but not at 15 minutes past the
                            # hour from the initial cycle point.
[[[ T00 ! W-1T00 ]]]        # Run daily at 00:00 except on Mondays.

It is also valid to use sequences for exclusions. For example:

[[[ PT1H ! PT6H ]]]         # Run hourly from the initial cycle point but
                            # not 6-hourly from the initial cycle point.
[[[ T-00 ! PT6H ]]]         # Run hourly on the hour but not 6-hourly
                            # on the hour.
    # Same as [[[ T-00 ! T-00/PT6H ]]] (T-00 context is implied)
    # Same as [[[ T-00 ! (T00, T06, T12, T18) ]]]
    # Same as [[[ PT1H ! (T00, T06, T12, T18) ]]] Initial cycle point dependent

[[[ T12 ! T12/P15D ]]]      # Run daily at 12:00 except every 15th day.

[[[ R/^/P1H ! R5/20000101T00/P1D ]]]    # Any valid recurrence may be used to
                                        # determine exclusions. This example
                                        # translates to: Repeat every hour from
                                        # the initial cycle point, but exclude
                                        # 00:00 for 5 days from the 1st January
                                        # 2000.

You can combine exclusion sequences and single point exclusions within a comma separated list enclosed in parentheses:

[[[ T-00 ! (20000101T07, PT2H) ]]]      # Run hourly on the hour but not at 07:00
                                        # on the 1st Jan, 2000 and not 2-hourly
                                        # on the hour.
9.3.4.5. How Multiple Graph Strings Combine

For a cycling graph with multiple validity sections for different hours of the day, the different sections add to generate the complete graph. Different graph sections can overlap (i.e. the same hours may appear in multiple section headings) and the same tasks may appear in multiple sections, but individual dependencies should be unique across the entire graph. For example, the following graph defines a duplicate prerequisite for task C:

[scheduling]
    [[dependencies]]
        [[[T00,T06,T12,T18]]]
            graph = "A => B => C"
        [[[T06,T18]]]
            graph = "B => C => X"
            # duplicate prerequisite: B => C already defined at T06, T18

This does not affect scheduling, but for the sake of clarity and brevity the graph should be written like this:

[scheduling]
    [[dependencies]]
        [[[T00,T06,T12,T18]]]
            graph = "A => B => C"
        [[[T06,T18]]]
            # X triggers off C only at 6 and 18 hours
            graph = "C => X"
9.3.4.6. Advanced Examples

The following examples show the various ways of writing graph headings in cylc.

[[[ R1 ]]]         # Run once at the initial cycle point
[[[ P1D ]]]        # Run every day starting at the initial cycle point
[[[ PT5M ]]]       # Run every 5 minutes starting at the initial cycle
                   # point
[[[ T00/P2W ]]]    # Run every 2 weeks starting at 00:00 after the
                   # initial cycle point
[[[ +P5D/P1M ]]]   # Run every month, starting 5 days after the initial
                   # cycle point
[[[ R1/T06 ]]]     # Run once at 06:00 after the initial cycle point
[[[ R1/P0Y ]]]     # Run once at the final cycle point
[[[ R1/$ ]]]       # Run once at the final cycle point (alternative
                   # form)
[[[ R1/$-P3D ]]]   # Run once three days before the final cycle point
[[[ R3/T0830 ]]]   # Run 3 times, every day at 08:30 after the initial
                   # cycle point
[[[ R3/01T00 ]]]   # Run 3 times, every month at 00:00 on the first
                   # of the month after the initial cycle point
[[[ R5/W-1/P1M ]]] # Run 5 times, every month starting on Monday
                   # following the initial cycle point
[[[ T00!^ ]]]      # Run at the first occurrence of T00 that isn't the
                   # initial cycle point
[[[ P1D!20000101 ]]]   # Run every day excluding 1st Jan 2000
[[[ 20140201T06/P1D ]]]    # Run every day starting at 20140201T06
[[[ R1/min(T00,T06,T12,T18) ]]]  # Run once at the first instance
                                 # of either T00, T06, T12 or T18
                                 # starting at the initial cycle
                                 # point
9.3.4.7. Advanced Starting Up

Dependencies that are only valid at the initial cycle point can be written using the R1 notation (e.g. as in Initial Non-Repeating (R1) Tasks). For example:

[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20130808T00
    final cycle point = 20130812T00
    [[dependencies]]
        [[[R1]]]
            graph = "prep => foo"
        [[[T00]]]
            graph = "foo[-P1D] => foo => bar"

In the example above, R1 implies R1/20130808T00, so prep only runs once at that cycle point (the initial cycle point). At that cycle point, foo will have a dependence on prep - but not at subsequent cycle points.

However, it is possible to have a suite that has multiple effective initial cycles - for example, one starting at T00 and another starting at T12. What if they need to share an initial task?

Let’s suppose that we add the following section to the suite example above:

[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20130808T00
    final cycle point = 20130812T00
    [[dependencies]]
        [[[R1]]]
            graph = "prep => foo"
        [[[T00]]]
            graph = "foo[-P1D] => foo => bar"
        [[[T12]]]
            graph = "baz[-P1D] => baz => qux"

We’ll also say that there should be a starting dependence between prep and our new task baz - but we still want to have a single prep task, at a single cycle.

We can write this using a special case of the task[-interval] syntax - if the interval is null, this implies the task at the initial cycle point.

For example, we can write our suite like Fig. 24.

_images/test4.png

Staggered Start Suite

[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20130808T00
    final cycle point = 20130812T00
    [[dependencies]]
        [[[R1]]]
            graph = "prep"
        [[[R1/T00]]]
            # ^ implies the initial cycle point:
            graph = "prep[^] => foo"
        [[[R1/T12]]]
            # ^ is the initial cycle point, as above:
            graph = "prep[^] => baz"
        [[[T00]]]
            graph = "foo[-P1D] => foo => bar"
        [[[T12]]]
            graph = "baz[-P1D] => baz => qux"
[visualization]
    initial cycle point = 20130808T00
    final cycle point = 20130810T00
    [[node attributes]]
        foo = "color=red"
        bar = "color=orange"
        baz = "color=green"
        qux = "color=blue"

This neatly expresses what we want - a task running at the initial cycle point that has one-off dependencies with other task sets at different cycles.

_images/test5.png

Restricted First Cycle Point Suite

[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20130808T00
    final cycle point = 20130808T18
    [[dependencies]]
        [[[R1]]]
            graph = "setup_foo => foo"
        [[[+PT6H/PT6H]]]
            graph = """
                foo[-PT6H] => foo
                foo => bar
            """
[visualization]
    initial cycle point = 20130808T00
    final cycle point = 20130808T18
    [[node attributes]]
        foo = "color=red"
        bar = "color=orange"

A different kind of requirement is displayed in Fig. 25. Usually, we want to specify additional tasks and dependencies at the initial cycle point. What if we want our first cycle point to be entirely special, with some tasks missing compared to subsequent cycle points?

In Fig. 25, bar will not be run at the initial cycle point, but will still run at subsequent cycle points. [[[+PT6H/PT6H]]] means start at +PT6H (6 hours after the initial cycle point) and then repeat every PT6H (6 hours).

Some suites may have staggered start-up sequences, where different tasks need to run once but only at specific cycle points - for example because data sources differ between cycle points and the initial cycle point can vary. To allow this, cylc provides a min( ) function that can be used as follows:

[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 20100101T03
    [[dependencies]]
        [[[R1/min(T00,T12)]]]
            graph = "prep1 => foo"
        [[[R1/min(T06,T18)]]]
            graph = "prep2 => foo"
        [[[T00,T06,T12,T18]]]
            graph = "foo => bar"

In this example the initial cycle point is 20100101T03, so the prep1 task will run once at 20100101T12 and the prep2 task will run once at 20100101T06 as these are the first cycle points after the initial cycle point in the respective min( ) entries.

9.3.4.8. Integer Cycling

In addition to non-repeating and date-time cycling workflows, cylc can do integer cycling for repeating workflows that are not date-time based.

To construct an integer cycling suite, set [scheduling]cycling mode = integer, and specify integer values for the initial and (optional) final cycle points. The notation for intervals, offsets, and recurrences (sequences) is similar to the date-time cycling notation, except for the simple integer values.

The full integer recurrence expressions supported are:

  • Rn/start-point/interval # e.g. R3/1/P2
  • Rn/interval/end-point # e.g. R3/P2/9

But, as for date-time cycling, sequence start and end points can be omitted where suite initial and final cycle points can be assumed. Some examples:

[[[ R1 ]]]        # Run once at the initial cycle point
                  # (short for R1/initial-point/?)
[[[ P1 ]]]        # Repeat with step 1 from the initial cycle point
                  # (short for R/initial-point/P1)
[[[ P5 ]]]        # Repeat with step 5 from the initial cycle point
                  # (short for R/initial-point/P5)
[[[ R2//P2 ]]]    # Run twice with step 2 from the initial cycle point
                  # (short for R2/initial-point/P2)
[[[ R/+P1/P2 ]]]  # Repeat with step 2, from 1 after the initial cycle point
[[[ R2/P2 ]]]     # Run twice with step 2, to the final cycle point
                  # (short for R2/P2/final-point)
[[[ R1/P0 ]]]     # Run once at the final cycle point
                  # (short for R1/P0/final-point)
9.3.4.8.1. Example

The tutorial illustrates integer cycling in Integer Cycling, and <cylc-dir>/etc/examples/satellite/ is a self-contained example of a realistic use for integer cycling. It simulates the processing of incoming satellite data: each new dataset arrives after a random (as far as the suite is concerned) interval, and is labeled by an arbitrary (as far as the suite is concerned) ID in the filename. A task called get_data at the top of the repeating workflow waits on the next dataset and, when it finds one, moves it to a cycle-point-specific shared workspace for processing by the downstream tasks. When get_data.1 finishes, get_data.2 triggers and begins waiting for the next dataset at the same time as the downstream tasks in cycle point 1 are processing the first one, and so on. In this way multiple datasets can be processed at once if they happen to come in quickly. A single shutdown task runs at the end of the final cycle to collate results. The suite graph is shown in Fig. 26.

_images/satellite.png

Fig. 26 The etc/examples/satellite integer suite.
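
The cycling core of that workflow can be sketched roughly as follows (get_data is from the example; the other task names here are placeholders, not the actual ones):

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    [[dependencies]]
        [[[P1]]]
            # the next get_data instance triggers as soon as the
            # previous one finishes, so several datasets can be
            # processed at once if they arrive quickly:
            graph = "get_data[-P1] => get_data => proc => archive"
        [[[R1/P0]]]
            # one-off collating task at the final cycle point:
            graph = "archive => collate"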

9.3.4.8.2. Advanced Integer Cycling Syntax

The same syntax used to reference the initial and final cycle points (introduced in Referencing The Initial And Final Cycle Points) for use with date-time cycling can also be used for integer cycling. For example you can write:

[[[ R1/^ ]]]     # Run once at the initial cycle point
[[[ R1/$ ]]]     # Run once at the final cycle point
[[[ R3/^/P2 ]]]  # Run three times with step two starting at the
                 # initial cycle point

Likewise the syntax introduced in Excluding Dates for excluding a particular point from a recurrence also works for integer cycling. For example:

[[[ R/P4!8 ]]]       # Run with step 4, to the final cycle point
                     # but not at point 8
[[[ R3/3/P2!5 ]]]    # Run with step 2 from point 3 but not at
                     # point 5
[[[ R/+P1/P6!14 ]]]  # Run with step 6 from 1 step after the
                     # initial cycle point but not at point 14

Multiple integer exclusions are also valid in the same way as the syntax in Excluding Dates. Integer exclusions may be a list of single integer points, an integer sequence, or a combination of both:

[[[ R/P1!(2,3,7) ]]]  # Run with step 1 to the final cycle point,
                      # but not at points 2, 3, or 7.
[[[ P1 ! P2 ]]]       # Run with step 1 from the initial to final
                      # cycle point, skipping every other step from
                      # the initial cycle point.
[[[ P1 ! +P1/P2 ]]]   # Run with step 1 from the initial cycle point,
                      # excluding every other step beginning one step
                      # after the initial cycle point.
[[[ P1 !(P2,6,8) ]]]  # Run with step 1 from the initial cycle point,
                      # excluding every other step, and also excluding
                      # steps 6 and 8.

9.3.5. Task Triggering

A task is said to “trigger” when it submits its job to run, as soon as all of its dependencies (also known as its separate “triggers”) are met. Tasks can be made to trigger off the state of other tasks (indicated by a :state qualifier on the upstream task or family name in the graph), off the clock, and off arbitrary external events.

External triggering is more complicated, and is documented separately in External Triggers.

9.3.5.1. Success Triggers

The default, with no trigger type specified, is to trigger off the upstream task succeeding:

# B triggers if A SUCCEEDS:
    graph = "A => B"

For consistency and completeness, however, the success trigger can be explicit:

# B triggers if A SUCCEEDS:
    graph = "A => B"
# or:
    graph = "A:succeed => B"
9.3.5.2. Failure Triggers

To trigger off the upstream task reporting failure:

# B triggers if A FAILS:
    graph = "A:fail => B"

Suicide triggers can be used to remove task B here if A does not fail; see Suicide Triggers.

9.3.5.3. Start Triggers

To trigger off the upstream task starting to execute:

# B triggers if A STARTS EXECUTING:
    graph = "A:start => B"

This can be used to trigger tasks that monitor other tasks once they (the target tasks) start executing. Consider a long-running forecast model, for instance, that generates a sequence of output files as it runs. A postprocessing task could be launched with a start trigger on the model (model:start => post) to process the model output as it becomes available. Note, however, that there are several alternative ways of handling this scenario: both tasks could be triggered at the same time (foo => model & post), but depending on external queue delays this could result in the monitoring task starting to execute first; or a different postprocessing task could be triggered off a message output for each data file (model:out1 => post1 etc.; see Message Triggers), but this may not be practical if the number of output files is large or if it is difficult to add cylc messaging calls to the model.
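
Side by side, the three alternatives discussed above (foo here is a hypothetical upstream task, as in the scenario):

# 1/ start trigger - post processes output as the model runs:
    graph = "model:start => post"
# 2/ trigger both together - queue delays could start post first:
    graph = "foo => model & post"
# 3/ a message trigger per output file (see Message Triggers):
    graph = "model:out1 => post1"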

9.3.5.4. Finish Triggers

To trigger off the upstream task succeeding or failing, i.e. finishing one way or the other:

# B triggers if A either SUCCEEDS or FAILS:
    graph = "A | A:fail => B"
# or
    graph = "A:finish => B"
9.3.5.5. Message Triggers

Tasks can also trigger off custom output messages. These must be registered in the [runtime] section of the emitting task, and reported using the cylc message command in task scripting. The graph trigger notation refers to the item name of the registered output message. The example suite <cylc-dir>/etc/examples/message-triggers illustrates message triggering.

[meta]
    title = "test suite for cylc-6 message triggers"

[scheduling]
    initial cycle point = 20140801T00
    final cycle point = 20141201T00
    [[dependencies]]
        [[[P2M]]]
           graph = """foo:out1 => bar
                      foo[-P2M]:out2 => baz"""
[runtime]
    [[foo]]
        script = """
sleep 5
cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" "file 1 done"
sleep 10
cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" "file 2 done"
sleep 10"""
        [[[outputs]]]
            out1 = "file 1 done"
            out2 = "file 2 done"
    [[bar, baz]]
        script = sleep 10
9.3.5.6. Job Submission Triggers

It is also possible to trigger off a task submitting, or failing to submit:

# B triggers if A submits successfully:
    graph = "A:submit => B"
# D triggers if C fails to submit successfully:
    graph = "C:submit-fail => D"

A possible use case for submit-fail triggers: if a task goes into the submit-failed state, possibly after several job submission retries, another task that inherits the same runtime but sets a different job submission method and/or host could be triggered to, in effect, run the same job on a different platform.
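
Here is a minimal sketch of that pattern (the task names, script, and batch system values are illustrative only):

[scheduling]
    [[dependencies]]
        graph = "A:submit-fail => B"
[runtime]
    [[A]]
        script = run-model.sh
        [[[job]]]
            batch system = pbs
    [[B]]
        # same runtime as A, but run on the suite host instead:
        inherit = A
        [[[job]]]
            batch system = background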

9.3.5.7. Conditional Triggers

AND operators (&) can appear on both sides of an arrow. They provide a concise alternative to defining multiple triggers separately:

# 1/ this:
    graph = "A & B => C"
# is equivalent to:
    graph = """A => C
               B => C"""
# 2/ this:
    graph = "A => B & C"
# is equivalent to:
    graph = """A => B
               A => C"""
# 3/ and this:
    graph = "A & B => C & D"
# is equivalent to this:
    graph = """A => C
               B => C
               A => D
               B => D"""

OR operators (|), which result in true conditional triggers, can only appear on the left [1]:

# C triggers when either A or B finishes:
    graph = "A | B => C"

Forecasting suites typically have simple conditional triggering requirements, but any valid conditional expression can be used, as shown in Fig. 27 (conditional triggers are plotted with open arrow heads).

_images/conditional-triggers.png

Conditional triggers, which are plotted with open arrow heads.

        graph = """
# D triggers if A or (B and C) succeed
A | B & C => D
# just to align the two graph sections
D => W
# Z triggers if (W or X) and Y succeed
(W|X) & Y => Z
                """
9.3.5.8. Suicide Triggers

Suicide triggers take tasks out of the suite. This can be used for automated failure recovery. The suite.rc listing and accompanying graph in Fig. 28 show how to define a chain of failure recovery tasks that trigger if they’re needed but otherwise remove themselves from the suite (you can run the AutoRecover.async example suite to see how this works). The dashed graph edges ending in solid dots indicate suicide triggers, and the open arrowheads indicate conditional triggers as usual. Suicide triggers are ignored by default in the graph view, unless you toggle them on with View -> Options -> Ignore Suicide Triggers.

_images/suicide.png

Automated failure recovery via suicide triggers.

[meta]
    title = automated failure recovery
    description = """
Model task failure triggers diagnosis
and recovery tasks, which take themselves
out of the suite if model succeeds. Model
post processing triggers off model OR
recovery tasks.
              """
[scheduling]
    [[dependencies]]
        graph = """
pre => model
model:fail => diagnose => recover
model => !diagnose & !recover
model | recover => post
                """
[runtime]
    [[model]]
        # UNCOMMENT TO TEST FAILURE:
        # script = /bin/false

Note

Multiple suicide triggers combine in the same way as other triggers, so this:

foo => !baz
bar => !baz

is equivalent to this:

foo & bar => !baz

i.e. both foo and bar must succeed for baz to be taken out of the suite. If you really want a task to be taken out if any one of several events occurs then be careful to write it that way:

foo | bar => !baz

Warning

A word of warning on the meaning of “bare suicide triggers”. Consider the following suite:

[scheduling]
    [[dependencies]]
        graph = "foo => !bar"

Task bar has a suicide trigger but no normal prerequisites (a suicide trigger is not a task triggering prerequisite, it is a task removal prerequisite) so this is entirely equivalent to:

[scheduling]
    [[dependencies]]
        graph = """
            foo & bar
            foo => !bar
                """

In other words both tasks will trigger immediately, at the same time, and then bar will be removed if foo succeeds.

If an active task proxy (currently in the submitted or running states) is removed from the suite by a suicide trigger, a warning will be logged.

9.3.5.9. Family Triggers

Families defined by the namespace inheritance hierarchy (Runtime - Task Configuration) can be used in the graph to trigger whole groups of tasks at the same time (e.g. forecast model ensembles, or groups of tasks for processing different observation types) and to trigger downstream tasks off families as a whole. Higher level families, i.e. families of families, can also be used, and are reduced to the lowest level member tasks.

Note

Tasks can also trigger off individual family members if necessary.

To trigger an entire task family at once:

[scheduling]
    [[dependencies]]
        graph = "foo => FAM"
[runtime]
    [[FAM]]    # a family (because others inherit from it)
    [[m1,m2]]  # family members (inherit from namespace FAM)
        inherit = FAM

This is equivalent to:

[scheduling]
    [[dependencies]]
        graph = "foo => m1 & m2"
[runtime]
    [[FAM]]
    [[m1,m2]]
        inherit = FAM

To trigger other tasks off families we have to specify whether to trigger off all members starting, succeeding, failing, or finishing, or off any member doing the same. Legal family triggers are thus:

[scheduling]
    [[dependencies]]
        graph = """
      # all-member triggers:
    FAM:start-all => one
    FAM:succeed-all => one
    FAM:fail-all => one
    FAM:finish-all => one
      # any-member triggers:
    FAM:start-any => one
    FAM:succeed-any => one
    FAM:fail-any => one
    FAM:finish-any => one
                """

Here’s how to trigger downstream processing if one or more family members succeed, but only after all members have finished (succeeded or failed):

[scheduling]
    [[dependencies]]
        graph = """
    FAM:finish-all & FAM:succeed-any => foo
                """
9.3.5.10. Efficient Inter-Family Triggering

While cylc allows writing dependencies between two families, it is important to consider the number of dependencies this will generate. In the following example, each member of FAM2 has dependencies pointing at all the members of FAM1.

[scheduling]
    [[dependencies]]
        graph = """
    FAM1:succeed-any => FAM2
                """

Expanding this out, you generate N * M dependencies, where N is the number of members of FAM1 and M is the number of members of FAM2. This can result in high memory use as the number of members of these families grows, potentially rendering the suite impractical for running on some systems.

You can greatly reduce the number of dependencies generated in these situations by putting dummy tasks in the graphing to represent the state of the family you want to trigger off. For example, if FAM2 should trigger off any member of FAM1 succeeding you can create a dummy task FAM1_succeed_any_marker and place a dependency on it as follows:

[scheduling]
    [[dependencies]]
        graph = """
    FAM1:succeed-any => FAM1_succeed_any_marker => FAM2
                """
[runtime]
# ...
    [[FAM1_succeed_any_marker]]
        script = true
# ...

This graph generates only N + M dependencies, which takes significantly less memory and CPU to store and evaluate.

9.3.5.11. Inter-Cycle Triggers

Typically most tasks in a suite will trigger off others in the same cycle point, but some may depend on tasks at other cycle points. This notably applies to warm-cycled forecast models, which depend on their own previous instances (see below); but other kinds of inter-cycle dependence are possible too [2]. Here’s how to express this kind of relationship in cylc:

[dependencies]
    [[PT6H]]
        # B triggers off A in the previous cycle point
        graph = "A[-PT6H] => B"

Inter-cycle offset and trigger type (or message trigger) notation can be combined:

# B triggers if A in the previous cycle point fails:
graph = "A[-PT6H]:fail => B"

At suite start-up inter-cycle triggers refer to a previous cycle point that does not exist. This does not cause the dependent task to wait indefinitely, however, because cylc ignores triggers that reach back beyond the initial cycle point. That said, the presence of an inter-cycle trigger does normally imply that something special has to happen at start-up. If a model depends on its own previous instance for restart files, for instance, then an initial set of restart files has to be generated somehow or the first model task will presumably fail with missing input files. There are several ways to handle this in cylc using different kinds of one-off (non-cycling) tasks that run at suite start-up. They are illustrated in Inter-Cycle Triggers; to summarize here briefly:

  • R1 tasks (recommended):

    [scheduling]
        [[dependencies]]
            [[[R1]]]
                graph = "prep"
            [[[R1/T00,R1/T12]]]
                graph = "prep[^] => foo"
            [[[T00,T12]]]
                graph = "foo[-PT12H] => foo => bar"
    

R1, or R1/date-time tasks are the recommended way to specify unusual start-up conditions. They allow a clean distinction between the dependencies of the initial cycles and those of subsequent cycles.

Initial tasks can be used for real model cold-start processes, whereby a warm-cycled model at any given cycle point can in principle have its inputs satisfied by a previous instance of itself, or by an initial task with (nominally) the same cycle point.

In effect, the R1 task masquerades as the previous-cycle-point trigger of its associated cycling task. At suite start-up initial tasks will trigger the first cycling tasks, and thereafter the inter-cycle trigger will take effect.

If a task has a dependency on another task at a different cycle point, the dependency can be written using the [offset] syntax, such as [-PT12H] in foo[-PT12H] => foo. This means that foo at the current cycle point depends on a previous instance of foo, 12 hours before the current cycle point. Unlike the cycling section headings (e.g. [[[T00,T12]]]), offsets in dependencies are relative to the current cycle point, not the initial cycle point.

However, it can be useful to have specific dependencies on tasks at or near the initial cycle point. You can switch the context of the offset to be the initial cycle point by using the caret symbol: ^.

For example, you can write foo[^] to mean foo at the initial cycle point, and foo[^+PT6H] to mean foo 6 hours after the initial cycle point. Usually, this kind of dependency will only apply in a limited number of cycle points near the start of the suite, so you may want to write it in R1-based cycling sections. Here’s the example inter-cycle R1 suite from above again.

[scheduling]
    [[dependencies]]
        [[[R1]]]
            graph = "prep"
        [[[R1/T00,R1/T12]]]
            graph = "prep[^] => foo"
        [[[T00,T12]]]
            graph = "foo[-PT12H] => foo => bar"

You can see there is a dependence on the initial R1 task prep for foo at the first T00 cycle point, and at the first T12 cycle point. Thereafter, foo just depends on its previous (12 hours ago) instance.

Finally, it is also possible to have a dependency on a task at a specific cycle point.

[scheduling]
    [[dependencies]]
        [[[R1/20200202]]]
            graph = "baz[20200101] => qux"

However, in a long running suite, a repeating task should avoid depending on a task at a specific cycle point (including the initial cycle point), as this can currently cause performance issues. In the following example, all instances of qux will depend on baz.20200101, which will never be removed from the task pool:

[scheduling]
    initial cycle point = 2010
    [[dependencies]]
        # Can cause performance issue!
        [[[P1D]]]
            graph = "baz[20200101] => qux"
9.3.5.12. Special Sequential Tasks

Tasks that depend on their own previous-cycle instance can be declared as sequential:

[scheduling]
    [[special tasks]]
        # foo depends on its previous instance:
        sequential = foo  # deprecated - see below!
    [[dependencies]]
        [[[T00,T12]]]
            graph = "foo => bar"

The sequential declaration is deprecated, however, in favor of explicit inter-cycle triggers, which clearly expose the same scheduling behaviour in the graph:

[scheduling]
    [[dependencies]]
        [[[T00,T12]]]
            # foo depends on its previous instance:
            graph = "foo[-PT12H] => foo => bar"

The sequential declaration is arguably convenient in one unusual situation though: if a task has a non-uniform cycling sequence then multiple explicit triggers,

[scheduling]
    [[dependencies]]
        [[[T00,T03,T11]]]
            graph = "foo => bar"
        [[[T00]]]
            graph = "foo[-PT13H] => foo"
        [[[T03]]]
            graph = "foo[-PT3H] => foo"
        [[[T11]]]
            graph = "foo[-PT8H] => foo"

can be replaced by a single sequential declaration,

[scheduling]
    [[special tasks]]
        sequential = foo
    [[dependencies]]
        [[[T00,T03,T11]]]
            graph = "foo => bar"
9.3.5.13. Future Triggers

Cylc also supports inter-cycle triggering off tasks “in the future” (with respect to cycle point - which has no bearing on wall-clock job submission time unless the task has a clock trigger):

[[dependencies]]
    [[[T00,T06,T12,T18]]]
        graph = """
    # A runs in this cycle:
            A
    # B in this cycle triggers off A in the next cycle.
            A[PT6H] => B
        """

Future triggers present a problem at suite shutdown rather than at start-up. Here, B at the final cycle point wants to trigger off an instance of A that will never exist because it is beyond the suite stop point. Consequently Cylc prevents tasks from spawning successors that depend on other tasks beyond the final point.

9.3.5.14. Clock Triggers

Note

Please read External Triggers (External Triggers) before using the older clock triggers described in this section.

By default, date-time cycle points are not connected to the real time “wall clock”. They are just labels that are passed to task jobs (e.g. to initialize an atmospheric model run with a particular date-time value). In real time cycling systems, however, some tasks - typically those near the top of the graph in each cycle - need to trigger at or near the time when their cycle point is equal to the real clock date-time.

So clock triggers allow tasks to trigger at (or after, depending on other triggers) a wall clock time expressed as an offset from cycle point:

[scheduling]
    [[special tasks]]
        clock-trigger = foo(PT2H)
    [[dependencies]]
        [[[T00]]]
            graph = foo

Here, foo[2015-08-23T00] would trigger (other dependencies allowing) when the wall clock time reaches 2015-08-23T02. Clock-trigger offsets are normally positive, to trigger some time after the wall-clock time reaches the task cycle point.

Clock-triggers have no effect on scheduling if a suite is running sufficiently far behind the clock (e.g. after a delay, or because it is processing archived historical data) that the trigger times, which are relative to task cycle point, have already passed.

9.3.5.15. Clock-Expire Triggers

Tasks can be configured to expire - i.e. to skip job submission and enter the expired state - if they are too far behind the wall clock when they become ready to run, and other tasks can trigger off this. As a possible use case, consider a cycling task that copies the latest of a set of files to overwrite the previous set: if the task is delayed by more than one cycle there may be no point in running it because the freshly copied files will just be overwritten immediately by the next task instance as the suite catches back up to real time operation. Clock-expire tasks are configured like clock-trigger tasks, with a date-time offset relative to cycle point ([scheduling] -> [[special tasks]] -> clock-expire). The offset should be positive to make the task expire if the wall-clock time has gone beyond the cycle point. Triggering off an expired task typically requires suicide triggers to remove the workflow that runs if the task has not expired. Here a task called copy expires, and its downstream workflow is skipped, if it is more than one day behind the wall-clock (see also etc/examples/clock-expire):

[cylc]
   cycle point format = %Y-%m-%dT%H
[scheduling]
    initial cycle point = 2015-08-15T00
    [[special tasks]]
        clock-expire = copy(-P1D)
    [[dependencies]]
        [[[P1D]]]
            graph = """
        model[-P1D] => model => copy => proc
              copy:expired => !proc"""
9.3.5.16. External Event Triggers

This is a substantial topic, documented in External Triggers.

9.3.6. Model Restart Dependencies

Warm-cycled forecast models generate restart files, e.g. model background fields, to initialize the next forecast. This kind of dependence requires an inter-cycle trigger:

[scheduling]
    [[dependencies]]
        [[[T00,T06,T12,T18]]]
            graph = "A[-PT6H] => A"

If your model is configured to write out additional restart files to allow one or more cycle points to be skipped in an emergency, do not represent these potential dependencies in the suite graph, as they should not be used under normal circumstances. For example, the following graph would result in task A erroneously triggering off A[-PT24H] as a matter of course, instead of off A[-PT6H], because A[-PT24H] will always be finished first:

[scheduling]
    [[dependencies]]
        [[[T00,T06,T12,T18]]]
            # DO NOT DO THIS (SEE ACCOMPANYING TEXT):
            graph = "A[-PT24H] | A[-PT18H] | A[-PT12H] | A[-PT6H] => A"

9.3.7. How The Graph Determines Task Instantiation

A graph trigger pair like foo => bar determines the existence and prerequisites (dependencies) of the downstream task bar, for the cycle points defined by the associated graph section heading. In general it does not say anything about the dependencies or existence of the upstream task foo. However if the trigger has no cycle point offset Cylc will infer that foo must exist at the same cycle points as bar. This is a convenience to allow this:

graph = "foo => bar"

to be written as shorthand for this:

graph = """foo
           foo => bar"""

(where foo by itself means <nothing> => foo, i.e. the task exists at these cycle points but has no prerequisites - although other prerequisites may be defined for it in other parts of the graph).

Cylc does not infer the existence of the upstream task in offset triggers like foo[-P1D] => bar because, as explained in No Implicit Creation of Tasks by Offset Triggers, a typo in the offset interval should generate an error rather than silently creating tasks on an erroneous cycling sequence.

As a result you need to be careful not to define inter-cycle dependencies that cannot be satisfied at run time. Suite validation catches this kind of error if the existence of the cycle offset task is not defined anywhere at all:

[scheduling]
    initial cycle point = 2020
    [[dependencies]]
        [[[P1Y]]]
            # ERROR
            graph = "foo[-P1Y] => bar"
$ cylc validate SUITE
'ERROR: No cycling sequences defined for foo'

To fix this, use another line in the graph to tell Cylc to define foo at each cycle point:

[scheduling]
    initial cycle point = 2020
    [[dependencies]]
        [[[P1Y]]]
            graph = """
                foo
                foo[-P1Y] => bar"""

But validation does not catch this kind of error if the offset task is defined only on a different cycling sequence:

[scheduling]
    initial cycle point = 2020
    [[dependencies]]
        [[[P2Y]]]
            graph = """foo
                # ERROR
                foo[-P1Y] => bar"""

This suite will validate OK, but it will stall at runtime with bar waiting on foo[-P1Y] at the intermediate years where it does not exist. The offset [-P1Y] is presumably an error (it should be [-P2Y]), or else another graph line is needed to generate foo instances on the yearly sequence:

[scheduling]
    initial cycle point = 2020
    [[dependencies]]
        [[[P1Y]]]
            graph = "foo"
        [[[P2Y]]]
            graph = "foo[-P1Y] => bar"

Similarly the following suite will validate OK, but it will stall at runtime with bar waiting on foo[-P1Y] in every cycle point, when only a single instance of it exists, at the initial cycle point:

[scheduling]
    initial cycle point = 2020
    [[dependencies]]
        [[[R1]]]
            graph = foo
        [[[P1Y]]]
            # ERROR
            graph = foo[-P1Y] => bar

Note

cylc graph will display unsatisfiable inter-cycle dependencies as “ghost nodes”. Fig. 29 is a screenshot of cylc graph displaying the above example with the unsatisfiable task (foo) shown as a “ghost node”.

_images/ghost-node-example.png

Fig. 29 Screenshot of cylc graph showing one task as a “ghost node”.

9.4. Runtime - Task Configuration

The [runtime] section of a suite configuration configures what to execute (and where and how to execute it) when each task is ready to run, in a multiple inheritance hierarchy of namespaces culminating in individual tasks. This allows all common configuration detail to be factored out and defined in one place.

Any namespace can configure any or all of the items defined in Suite.rc Reference.

Namespaces that do not explicitly inherit from others automatically inherit from the root namespace (below).

Nested namespaces define task families that can be used in the graph as convenient shorthand for triggering all member tasks at once, or for triggering other tasks off all members at once - see Family Triggers. Nested namespaces can be progressively expanded and collapsed in the dependency graph viewer, and in the gcylc graph and text views. Only the first parent of each namespace (as for single-inheritance) is used for suite visualization purposes.

9.4.1. Task and Namespace Names

There are restrictions on names that can be used for tasks and namespaces to ensure all tasks are processed and implemented correctly. Valid names must:

  1. begin with (or consist only of, as single-character task names are allowed) either:
    • an alphanumeric character, i.e. a letter in either upper or lower case (a-z or A-Z), or a digit (0-9);
    • an underscore (_).
  2. otherwise contain only characters from the following options:
    • alphanumeric characters or underscores (as above);
    • any of these additional character symbols:
      • hyphens (-);
      • plus characters (+);
      • percent signs (%);
      • “at” signs (@).
  3. not be so long (typically over 255 characters) as to raise errors from exceeding the maximum filename length on the operating system for generated outputs, e.g. directories named (in part) after the task they concern.

Warning

Task and namespace names may not contain colons (:), which would preclude the use of directory paths involving the registration name in $PATH variables. They also may not contain the dot (.) character, as it will be interpreted as the delimiter separating the task name from an appended cycle point (see Task Identifiers).

Invalid task or namespace names will be reported as errors by cylc validate.

Note

Task names need not be hardwired into task implementations because task and suite identity can be extracted portably from the task execution environment supplied by the suite server program (Task Execution Environment) - then to rename a task you can just change its name in the suite configuration.

9.4.2. Root - Runtime Defaults

The root namespace, at the base of the inheritance hierarchy, provides default configuration for all tasks in the suite. Most root items are unset by default, but some have default values sufficient to allow test suites to be defined by dependency graph alone. The script item, for example, defaults to code that prints a message then sleeps for between 1 and 15 seconds and exits. Default values are documented with each item in Suite.rc Reference. You can override the defaults or provide your own defaults by explicitly configuring the root namespace.
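
For example, here is a minimal sketch of a root override that gives every task a default script and a suite-wide environment variable (the values are illustrative only):

[runtime]
    [[root]]
        script = echo "running $CYLC_TASK_ID"
        [[[environment]]]
            DATA_DIR = $HOME/data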

9.4.3. Defining Multiple Namespaces At Once

If a namespace section heading is a comma-separated list of names then the subsequent configuration applies to each list member. Particular tasks can be singled out at run time using the $CYLC_TASK_NAME variable.

As an example, consider a suite containing an ensemble of closely related tasks that each invokes the same script but with a unique argument that identifies the calling task name:

[runtime]
    [[ENSEMBLE]]
        script = "run-model.sh $CYLC_TASK_NAME"
    [[m1, m2, m3]]
        inherit = ENSEMBLE

For large ensembles template processing can be used to automatically generate the member names and associated dependencies (see Jinja2 and EmPy).

9.4.4. Runtime Inheritance - Single

The following listing of the inherit.single.one example suite illustrates basic runtime inheritance with single parents.

# SUITE.RC
[meta]
    title = "User Guide [runtime] example."
[cylc]
    required run mode = simulation # (no task implementations)
[scheduling]
    initial cycle point = 20110101T06
    final cycle point = 20110102T00
    [[dependencies]]
        [[[T00]]]
            graph = """foo => OBS
                 OBS:succeed-all => bar"""
[runtime]
    [[root]] # base namespace for all tasks (defines suite-wide defaults)
        [[[job]]]
            batch system = at
        [[[environment]]]
            COLOR = red
    [[OBS]]  # family (inherited by land, ship); implicitly inherits root
        script = run-${CYLC_TASK_NAME}.sh
        [[[environment]]]
            RUNNING_DIR = $HOME/running/$CYLC_TASK_NAME
    [[land]] # a task (a leaf on the inheritance tree) in the OBS family
        inherit = OBS
        [[[meta]]]
            description = land obs processing
    [[ship]] # a task (a leaf on the inheritance tree) in the OBS family
        inherit = OBS
        [[[meta]]]
            description = ship obs processing
        [[[job]]]
            batch system = loadleveler
        [[[environment]]]
            RUNNING_DIR = $HOME/running/ship  # override OBS environment
            OUTPUT_DIR = $HOME/output/ship    # add to OBS environment
    [[foo]]
        # (just inherits from root)

    # The task [[bar]] is implicitly defined by its presence in the
    # graph; it is also a dummy task that just inherits from root.

9.4.5. Runtime Inheritance - Multiple

If a namespace inherits from multiple parents the linear order of precedence (which namespace overrides which) is determined by the so-called C3 algorithm used to find the linear method resolution order for class hierarchies in Python and several other object oriented programming languages. The result of this should be fairly obvious for typical use of multiple inheritance in cylc suites, but for detailed documentation of how the algorithm works refer to the official Python documentation.

The inherit.multi.one example suite, listed here, makes use of multiple inheritance:

[meta]
    title = "multiple inheritance example"

    description = """To see how multiple inheritance works:

 % cylc list -tb[m] SUITE # list namespaces
 % cylc graph -n SUITE # graph namespaces
 % cylc graph SUITE # dependencies, collapse on first-parent namespaces

 % cylc get-config --sparse --item [runtime]ops_s1 SUITE
 % cylc get-config --sparse --item [runtime]var_p2 foo"""

[scheduling]
    [[dependencies]]
        graph = "OPS:finish-all => VAR"

[runtime]
    [[root]]
    [[OPS]]
        script = echo "RUN: run-ops.sh"
    [[VAR]]
        script = echo "RUN: run-var.sh"
    [[SERIAL]]
        [[[directives]]]
            job_type = serial
    [[PARALLEL]]
        [[[directives]]]
            job_type = parallel
    [[ops_s1, ops_s2]]
        inherit = OPS, SERIAL

    [[ops_p1, ops_p2]]
        inherit = OPS, PARALLEL
        
    [[var_s1, var_s2]]
        inherit = VAR, SERIAL

    [[var_p1, var_p2]]
        inherit = VAR, PARALLEL

[visualization]
    # NOTE ON VISUALIZATION AND MULTIPLE INHERITANCE: overlapping
    # family groups can have overlapping attributes, so long as 
    # non-conflicting attributes are used to style each group. Below,
    # for example, OPS tasks are filled green and SERIAL tasks are
    # outlined blue, so that ops_s1 and ops_s2 are green with a blue
    # outline. But if the SERIAL tasks are explicitly styled as "not
    # filled" (by setting "style=") this will override the fill setting
    # in the (previously defined and therefore lower precedence) OPS
    # group, making ops_s1 and ops_s2 unfilled with a blue outline.
    # Alternatively you can just create a manual node group for ops_s1
    # and ops_s2 and style them separately.
    [[node groups]]
        #(see comment above:)
        #serial_ops = ops_s1, ops_s2
    [[node attributes]]
        OPS = "style=filled", "fillcolor=green"
        SERIAL = "color=blue" #(see comment above:), "style="
        #(see comment above:)
        #serial_ops = "color=blue", "style=filled", "fillcolor=green"

cylc get-suite-config provides an easy way to check the result of inheritance in a suite. You can extract specific items, e.g.:

$ cylc get-suite-config --item '[runtime][var_p2]script' \
    inherit.multi.one
echo "RUN: run-var.sh"

or use the --sparse option to print entire namespaces without obscuring the result with the dense runtime structure obtained from the root namespace:

$ cylc get-suite-config --sparse --item '[runtime]ops_s1' inherit.multi.one
script = echo "RUN: run-ops.sh"
inherit = ['OPS', 'SERIAL']
[directives]
   job_type = serial
9.4.5.1. Suite Visualization And Multiple Inheritance

The first parent inherited by a namespace is also used as the collapsible family group when visualizing the suite. If this is not what you want, you can demote the first parent for visualization purposes, without affecting the order of inheritance of runtime properties:

[runtime]
    [[BAR]]
        # ...
    [[foo]]
        # inherit properties from BAR, but stay under root for visualization:
        inherit = None, BAR

9.4.6. How Runtime Inheritance Works

The linear precedence order of ancestors is computed for each namespace using the C3 algorithm. Then any runtime items that are explicitly configured in the suite configuration are “inherited” up the linearized hierarchy for each task, starting at the root namespace: if a particular item is defined at multiple levels in the hierarchy, the level nearest the final task namespace takes precedence. Finally, root namespace defaults are applied for every item that has not been configured in the inheritance process (this is more efficient than carrying the full dense namespace structure through from root from the beginning).
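
The following sketch illustrates the precedence rule, with the same environment item set at two levels of the hierarchy:

[runtime]
    [[A]]
        [[[environment]]]
            X = a
    [[B]]
        inherit = A
        [[[environment]]]
            X = b  # overrides the value inherited from A
    [[b1]]
        # b1 gets X=b, because B is nearer than A in the linearized hierarchy
        inherit = B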

9.4.7. Task Execution Environment

The task execution environment contains suite and task identity variables provided by the suite server program, and user-defined environment variables. The environment is explicitly exported (by the task job script) prior to executing the task script (see Task Job Submission and Management).

Suite and task identity are exported first, so that user-defined variables can refer to them. Order of definition is preserved throughout so that variable assignment expressions can safely refer to previously defined variables.
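
For example (a minimal sketch), a later variable can safely refer to an earlier one and to the suite-provided identity variables:

[runtime]
    [[foo]]
        [[[environment]]]
            DATA_DIR = $HOME/data
            OBS_FILE = $DATA_DIR/obs-$CYLC_TASK_CYCLE_POINT.nc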

Additionally, access to cylc itself is configured prior to the user-defined environment, so that variable assignment expressions can make use of cylc utility commands:

[runtime]
    [[foo]]
        [[[environment]]]
            REFERENCE_TIME = $( cylc util cycletime --offset-hours=6 )
9.4.7.1. User Environment Variables

A task’s user-defined environment results from its inherited [[[environment]]] sections:

[runtime]
    [[root]]
        [[[environment]]]
            COLOR = red
            SHAPE = circle
    [[foo]]
        [[[environment]]]
            COLOR = blue  # root override
            TEXTURE = rough # new variable

This results in a task foo with SHAPE=circle, COLOR=blue, and TEXTURE=rough in its environment.

9.4.7.2. Overriding Environment Variables

When you override inherited namespace items the original parent item definition is replaced by the new definition. This applies to all items including those in the environment sub-sections which, strictly speaking, are not “environment variables” until they are written, post inheritance processing, to the task job script that executes the associated task. Consequently, if you override an environment variable you cannot also access the original parent value:

[runtime]
    [[FOO]]
        [[[environment]]]
            COLOR = red
    [[bar]]
        inherit = FOO
        [[[environment]]]
            tmp = $COLOR        # !! ERROR: $COLOR is undefined here
            COLOR = dark-$tmp   # !! as this overrides COLOR in FOO.

The compressed variant of this, COLOR = dark-$COLOR, is also in error for the same reason. To achieve the desired result you must use a different name for the parent variable:

[runtime]
    [[FOO]]
        [[[environment]]]
            FOO_COLOR = red
    [[bar]]
        inherit = FOO
        [[[environment]]]
            COLOR = dark-$FOO_COLOR  # OK
9.4.7.3. Task Job Script Variables

These are variables that can be referenced (but should not be modified) in a task job script.

The task job script may export the following environment variables:

CYLC_DEBUG                      # Debug mode, true or not defined
CYLC_DIR                        # Location of cylc installation used
CYLC_VERSION                    # Version of cylc installation used

CYLC_CYCLING_MODE               # Cycling mode, e.g. gregorian
CYLC_SUITE_FINAL_CYCLE_POINT    # Final cycle point
CYLC_SUITE_INITIAL_CYCLE_POINT  # Initial cycle point
CYLC_SUITE_NAME                 # Suite name
CYLC_UTC                        # UTC mode, True or False
CYLC_VERBOSE                    # Verbose mode, True or False
TZ                              # Set to "UTC" in UTC mode or not defined

CYLC_SUITE_RUN_DIR              # Location of the suite run directory in
                                # job host, e.g. ~/cylc-run/foo
CYLC_SUITE_DEF_PATH             # Location of the suite configuration directory in
                                # job host, e.g. ~/cylc-run/foo
CYLC_SUITE_HOST                 # Host running the suite process
CYLC_SUITE_OWNER                # User ID running the suite process
CYLC_SUITE_DEF_PATH_ON_SUITE_HOST
                                # Location of the suite configuration directory in
                                # suite host, e.g. ~/cylc-run/foo
CYLC_SUITE_SHARE_DIR            # Suite (or task!) shared directory (see below)
CYLC_SUITE_UUID                 # Suite UUID string
CYLC_SUITE_WORK_DIR             # Suite work directory (see below)

CYLC_TASK_JOB                   # Task job identifier expressed as
                                # CYCLE-POINT/TASK-NAME/SUBMIT-NUM
                                # e.g. 20110511T1800Z/t1/01
CYLC_TASK_CYCLE_POINT           # Cycle point, e.g. 20110511T1800Z
CYLC_TASK_NAME                  # Job's task name, e.g. t1
CYLC_TASK_SUBMIT_NUMBER         # Job's submit number, e.g. 1,
                                # increments with every submit
CYLC_TASK_TRY_NUMBER            # Number of execution tries, e.g. 1
                                # increments with automatic retry-on-fail
CYLC_TASK_ID                    # Task instance identifier expressed as
                                # TASK-NAME.CYCLE-POINT
                                # e.g. t1.20110511T1800Z
CYLC_TASK_LOG_DIR               # Location of the job log directory
                                # e.g. ~/cylc-run/foo/log/job/20110511T1800Z/t1/01/
CYLC_TASK_LOG_ROOT              # The task job file path
                                # e.g. ~/cylc-run/foo/log/job/20110511T1800Z/t1/01/job
CYLC_TASK_WORK_DIR              # Location of task work directory (see below)
                                # e.g. ~/cylc-run/foo/work/20110511T1800Z/t1
CYLC_TASK_NAMESPACE_HIERARCHY   # Linearised family namespace of the task,
                                # e.g. root postproc t1
CYLC_TASK_DEPENDENCIES          # List of met dependencies that triggered the task
                                # e.g. foo.1 bar.1

CYLC_TASK_COMMS_METHOD          # Set to "ssh" if communication method is "ssh"
CYLC_TASK_SSH_LOGIN_SHELL       # With "ssh" communication, if set to "True",
                                # use login shell on suite host

There are also some global shell variables that may be defined in the task job script (but not exported to the environment). These include:

CYLC_FAIL_SIGNALS               # List of signals trapped by the error trap
CYLC_VACATION_SIGNALS           # List of signals trapped by the vacation trap
CYLC_SUITE_WORK_DIR_ROOT        # Root directory above the suite work directory
                                # in the job host
CYLC_TASK_MESSAGE_STARTED_PID   # PID of "cylc message" job started" command
CYLC_TASK_WORK_DIR_BASE         # Alternate task work directory,
                                # relative to the suite work directory
9.4.7.4. Suite Share Directories

A suite share directory is created automatically under the suite run directory as a share space for tasks. The location is available to tasks as $CYLC_SUITE_SHARE_DIR. In a cycling suite, output files are typically held in cycle point sub-directories of the suite share directory.
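
Here is a minimal sketch of two tasks communicating through the share directory (the task and file names are illustrative only):

[scheduling]
    [[dependencies]]
        graph = "write => read"
[runtime]
    [[write]]
        script = echo "hello" > "$CYLC_SUITE_SHARE_DIR/hello.txt"
    [[read]]
        script = cat "$CYLC_SUITE_SHARE_DIR/hello.txt"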

The top level share and work directory (below) location can be changed (e.g. to a large data area) by a global config setting (see [hosts] -> [[HOST]] -> work directory).

9.4.7.5. Task Work Directories

Task job scripts are executed from within work directories created automatically under the suite run directory. A task can get its own work directory from $CYLC_TASK_WORK_DIR (or simply $PWD if it does not cd elsewhere at runtime). By default the location contains task name and cycle point, to provide a unique workspace for every instance of every task. This can be overridden in the suite configuration, however, to get several tasks to share the same work directory (see [runtime] -> [[__NAME__]] -> work sub-directory).

The top level work and share directory (above) location can be changed (e.g. to a large data area) by a global config setting (see [hosts] -> [[HOST]] -> work directory).

9.4.7.6. Environment Variable Evaluation

Variables in the task execution environment are not evaluated in the shell in which the suite is running prior to submitting the task. They are written in unevaluated form to the job script that is submitted by cylc to run the task (Task Job Scripts) and are therefore evaluated when the task begins executing under the task owner account on the task host. Thus $HOME, for instance, evaluates at run time to the home directory of the task owner on the task host.

9.4.8. How Tasks Get Access To The Suite Directory

Tasks can use $CYLC_SUITE_DEF_PATH to access suite files on the task host, and the suite bin directory is automatically added to $PATH. If a remote suite configuration directory is not specified, the local (suite host) path will be assumed, with the local home directory, if present, swapped for the literal $HOME for evaluation on the task host.
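
For example (a sketch, where report-config.sh stands for a hypothetical script kept in the suite bin directory):

[runtime]
    [[foo]]
        # report-config.sh is found via $PATH; its argument locates a
        # file under the suite configuration directory on the task host
        script = report-config.sh "$CYLC_SUITE_DEF_PATH/etc/site.conf"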

9.4.9. Remote Task Hosting

If a task declares an owner other than the suite owner and/or a host other than the suite host, cylc will use non-interactive ssh to execute the task on the owner@host account via the configured batch system:

[runtime]
    [[foo]]
        [[[remote]]]
            host = orca.niwa.co.nz
            owner = bob
        [[[job]]]
            batch system = pbs

For this to work:

  • non-interactive ssh is required from the suite host to the remote task accounts.
  • cylc must be installed on task hosts.
    • Optional software dependencies such as graphviz and Jinja2 are not needed on task hosts.
    • If polling task communication is used, there is no other requirement.
    • If SSH task communication is configured, non-interactive ssh is required from the task host to the suite host.
    • If (default) task communication is configured, the task host should have access to the port on the suite host.
  • the suite configuration directory, or some fraction of its content, can be installed on the task host, if needed.

To learn how to give remote tasks access to cylc, see Task Job Access To Cylc.

Tasks running on the suite host under another user account are treated as remote tasks.

Remote hosting, like all namespace settings, can be declared globally in the root namespace, or per family, or for individual tasks.

9.4.9.1. Dynamic Host Selection

Instead of hardwiring host names into the suite configuration you can specify a shell command that prints a hostname, or an environment variable that holds a hostname, as the value of the host config item. See [runtime] -> [[__NAME__]] -> [[[remote]]] -> host.
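
For example (a sketch; select-host.sh and $FOO_HOST are hypothetical):

[runtime]
    [[foo]]
        [[[remote]]]
            # a command that prints the name of the host to use:
            host = $(select-host.sh)
    [[bar]]
        [[[remote]]]
            # or an environment variable that holds a hostname:
            host = $FOO_HOST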

9.4.9.2. Remote Task Log Directories

Task stdout and stderr streams are written to log files in a suite-specific sub-directory of the suite run directory, as explained in Task stdout And stderr Logs. For remote tasks the same directory is used, but on the task host. Remote task log directories, like local ones, are created on the fly, if necessary, during job submission.

9.5. Visualization

The visualization section of a suite configuration is used to configure suite graphing, principally graph node (task) and edge (dependency arrow) style attributes. Tasks can be grouped for the purpose of applying common style attributes. See Suite.rc Reference for details.
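
For example, here is a minimal styling sketch (the attribute strings are standard Graphviz node properties; the task name foo is illustrative):

[visualization]
    default node attributes = "style=unfilled", "shape=ellipse"
    [[node attributes]]
        foo = "style=filled", "fillcolor=orange"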

9.5.1. Collapsible Families In Suite Graphs

[visualization]
    collapsed families = family1, family2

Nested families from the runtime inheritance hierarchy can be expanded and collapsed in suite graphs and the gcylc graph view. All families are displayed in the collapsed state at first, unless [visualization]collapsed families is used to single out specific families for initial collapsing.

In the gcylc graph view, nodes outside of the main graph (such as the members of collapsed families) are plotted as rectangular nodes to the right if they are doing anything interesting (submitted, running, failed).

Fig. 30 illustrates successive expansion of nested task families in the namespaces example suite.

_images/inherit-2.png
_images/inherit-3.png
_images/inherit-4.png
_images/inherit-5.png
_images/inherit-6.png
_images/inherit-7.png

Fig. 30 Graphs of the namespaces example suite showing various states of expansion of the nested namespace family hierarchy, from all families collapsed (top left) through to all expanded (bottom right). This can also be done by right-clicking on tasks in the gcylc graph view.

9.6. Parameterized Tasks

Cylc can automatically generate tasks and dependencies by expanding parameterized task names over lists of parameter values. Uses for this include:

  • generating an ensemble of similar model runs
  • generating chains of tasks to process similar datasets
  • replicating an entire workflow, or part thereof, over several runs
  • splitting a long model run into smaller steps or chunks (parameterized cycling)

Note

This can be done with Jinja2 loops too (Jinja2), but parameterization is much cleaner (nested loops can seriously reduce the clarity of a suite configuration).

9.6.1. Parameter Expansion

Parameter values can be lists of strings, or lists of integers and integer ranges (with inclusive bounds). Numeric values in a list of strings are considered strings. It is not possible to mix strings with integer ranges.

For example:

[cylc]
    [[parameters]]
        # parameters: "ship", "buoy", "plane"
        # default task suffixes: _ship, _buoy, _plane
        obs = ship, buoy, plane

        # parameters: 1, 2, 3, 4, 5
        # default task suffixes: _run1, _run2, _run3, _run4, _run5
        run = 1..5

        # parameters: 1, 3, 5, 7, 9
        # default task suffixes: _idx1, _idx3, _idx5, _idx7, _idx9
        idx = 1..9..2

        # parameters: -11, -1, 9
        # default task suffixes: _idx-11, _idx-01, _idx+09
        idx = -11..9..10

        # parameters: 1, 3, 5, 10, 11, 12, 13
        # default task suffixes: _i01, _i03, _i05, _i10, _i11, _i12, _i13
        i = 1..5..2, 10, 11..13

        # parameters: "0", "1", "e", "pi", "i"
        # default task suffixes: _0, _1, _e, _pi, _i
        item = 0, 1, e, pi, i

        # ERROR: mix strings with int range
        p = one, two, 3..5

Then angle brackets denote use of these parameters throughout the suite configuration. For the values above, this parameterized name:

model<run>  # for run = 1..5

expands to these concrete task names:

model_run1, model_run2, model_run3, model_run4, model_run5

and this parameterized name:

proc<obs>  # for obs = ship, buoy, plane

expands to these concrete task names:

proc_ship, proc_buoy, proc_plane

By default, to avoid any ambiguity, the parameter name appears in the expanded task names for integer values, but not for string values. For example, model_run1 for run = 1, but proc_ship for obs = ship. However, the default expansion templates can be overridden if need be:

[cylc]
    [[parameters]]
        obs = ship, buoy, plane
        run = 1..5
    [[parameter templates]]
        run = -R%(run)s  # Make foo<run> expand to foo-R1 etc.

(See [cylc] -> [[parameter templates]] for more on the string template syntax.)

Any number of parameters can be used at once. This parameterization:

model<run,obs>  # for run = 1..2 and obs = ship, buoy, plane

expands to these task names:

model_run1_ship, model_run1_buoy, model_run1_plane,
model_run2_ship, model_run2_buoy, model_run2_plane

Here’s a simple but complete example suite:

[cylc]
    [[parameters]]
        run = 1..2
[scheduling]
    [[dependencies]]
        graph = "prep => model<run>"
[runtime]
    [[model<run>]]
        # ...

The result, post parameter expansion, is this:

[scheduling]
    [[dependencies]]
        graph = "prep => model_run1 & model_run2"
[runtime]
    [[model_run1]]
        # ...
    [[model_run2]]
        # ...

Here’s a more complex graph using two parameters ([runtime] omitted):

[cylc]
    [[parameters]]
        run = 1..2
        mem = cat, dog
[scheduling]
    [[dependencies]]
        graph = """prep => init<run> => model<run,mem> =>
                      post<run,mem> => wrap<run> => done"""

Fig. 31 shows the result as visualized by cylc graph.

_images/params1.png

Fig. 31 Parameter expansion example.

9.6.1.1. Zero-Padded Integer Values

Integer parameter values are given a default template for generating task suffixes that are zero-padded according to the longest size of their values. For example, the default template for p = 9..10 would be _p%(p)02d, so that foo<p> would become foo_p09, foo_p10. If negative values are present in the parameter list, the default template will include the sign. For example, the default template for p = -1..1 would be _p%(p)+02d, so that foo<p> would become foo_p-1, foo_p+0, foo_p+1.

To get thicker padding and/or alternate suffixes, use a template. E.g.:

[cylc]
    [[parameters]]
        i = 1..9
        p = 3..14
    [[parameter templates]]
        i = _i%(i)02d  # suffixes = _i01, _i02, ..., _i09
        # A double-percent gives a literal percent character
        p = %%p%(p)03d  # suffixes = %p003, %p004, ..., %p013, %p014
9.6.1.2. Parameters as Full Task Names

Parameter values can be used as full task names, but the default template should be overridden to remove the initial underscore. For example:

[cylc]
    [[parameters]]
        i = 1..4
        obs = ship, buoy, plane
    [[parameter templates]]
        i = i%(i)d  # task name must begin with a letter
        obs = %(obs)s
[scheduling]
    [[dependencies]]
        graph = """
foo => <i>  # foo => i1 & i2 & i3 & i4
<obs> => bar  # ship & buoy & plane => bar
"""

9.6.2. Passing Parameter Values To Tasks

Parameter values are passed as environment variables to tasks generated by parameter expansion. For example, if we have:

[cylc]
    [[parameters]]
        obs = ship, buoy, plane
        run = 1..5
[scheduling]
    [[dependencies]]
        graph = model<run,obs>

Then task model_run2_ship would get the following standard environment variables:

# In a job script of an instance of the "model_run2_ship" task:
export CYLC_TASK_PARAM_run="2"
export CYLC_TASK_PARAM_obs="ship"

These variables allow tasks to determine which member of a parameterized group they are, and so to vary their behaviour accordingly.
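
For example (a sketch; run-model.sh and its options are hypothetical):

[runtime]
    [[model<run,obs>]]
        # each generated task selects its own inputs from its parameter values
        script = run-model.sh --member="$CYLC_TASK_PARAM_run" --obs="$CYLC_TASK_PARAM_obs"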

You can also define custom variables and string templates for parameter value substitution. For example, if we add this to the above configuration:

[runtime]
    [[model<run,obs>]]
        [[[environment]]]
            MYNAME = %(obs)sy-mc%(obs)sface
            MYFILE = /path/to/run%(run)03d/%(obs)s

Then task model_run2_ship would get the following custom environment variables:

# In a job script of an instance of the "model_run2_ship" task:
export MYNAME MYFILE
MYNAME=shipy-mcshipface
MYFILE=/path/to/run002/ship

9.6.3. Selecting Specific Parameter Values

Specific parameter values can be singled out in the graph and under [runtime] with the notation <p=5> (for example). Here’s how to make a special task trigger off just the first of a set of model runs:

[cylc]
    [[parameters]]
        run = 1..5
[scheduling]
    [[dependencies]]
        graph = """ model<run> => post_proc<run>  # general case
                    model<run=1> => check_first_run """  # special case
[runtime]
    [[model<run>]]
        # config for all "model" runs...
    [[model<run=1>]]
        # special config (if any) for the first model run...
    #...

9.6.4. Selecting Partial Parameter Ranges

The parameter notation does not currently support partial range selection such as foo<p=5..10>, but you can achieve the same result by defining a second parameter that covers the partial range and giving it the same expansion template as the full-range parameter. For example:

[cylc]
    [[parameters]]
        run = 1..10  # 1, 2, ..., 10
        runx = 1..3  # 1, 2, 3
    [[parameter templates]]
        run = _R%(run)02d   # _R01, _R02, ..., _R10
        runx = _R%(runx)02d  # _R01, _R02, _R03
[scheduling]
    [[dependencies]]
        graph = """model<run> => post<run>
                   model<runx> => checkx<runx>"""
[runtime]
    [[model<run>]]
        # ...
    #...

9.6.5. Parameter Offsets In The Graph

A negative offset notation <NAME-1> is interpreted as the previous value in the ordered list of parameter values, while a positive offset is interpreted as the next value. For example, to split a model run into multiple steps with each step depending on the previous one, either of these graphs:

graph = "model<run-1> => model<run>"  # for run = 1, 2, 3
graph = "model<run> => model<run+1>"  # for run = 1, 2, 3

expands to:

graph = """model_run1 => model_run2
           model_run2 => model_run3"""

# or equivalently:

graph = "model_run1 => model_run2 => model_run3"

And this graph:

graph = "proc<size-1> => proc<size>"  # for size = small, big, huge

expands to:

graph = """proc_small => proc_big
           proc_big => proc_huge"""

# or equivalently:

graph = "proc_small => proc_big => proc_huge"

However, a quirk in the current system means that you should avoid mixing conditional logic in these statements. For example, the following will do the unexpected:

graph = foo<m-1> & baz => foo<m>  # for m = cat, dog

currently expands to:

graph = foo_cat & baz => foo_dog

# when users may expect it to be:
#    graph = foo_cat => foo_dog
#    graph = baz => foo_cat & foo_dog

For the time being, writing out the logic explicitly will give you the correct graph:

graph = """foo<m-1> => foo<m>  # for m = cat, dog
           baz => foo<m>"""

9.6.6. Task Families And Parameterization

Task family members can be generated by parameter expansion:

[runtime]
    [[FAM]]
    [[member<r>]]
        inherit = FAM
# Result: family FAM contains member_r1, member_r2, etc.

Family names can be parameterized too, just like task names:

[runtime]
    [[RUN<r>]]
    [[model<r>]]
        inherit = RUN<r>
    [[post_proc<r>]]
        inherit = RUN<r>
# Result: family RUN_r1 contains model_r1 and post_proc_r1,
#         family RUN_r2 contains model_r2 and post_proc_r2, etc.

As described in Family Triggers family names can be used to trigger all members at once:

graph = "foo => FAMILY"

or to trigger off all members:

graph = "FAMILY:succeed-all => bar"

or to trigger off any members:

graph = "FAMILY:succeed-any => bar"

If the members of FAMILY were generated with parameters, you can also trigger them all at once with parameter notation:

graph = "foo => member<m>"

Similarly, to trigger off all members:

graph = "member<m> => bar"
# (member<m>:fail etc., for other trigger types)

Family names are still needed in the graph, however, to succinctly express “succeed-any” triggering semantics, and all-to-all or any-to-all triggering:

graph = "FAM1:succeed-any => FAM2"

(Direct all-to-all and any-to-all family triggering is not recommended for efficiency reasons though - see Efficient Inter-Family Triggering).

For family member-to-member triggering use parameterized members. For example, if family OBS_GET has members get<obs> and family OBS_PROC has members proc<obs> then this graph:

graph = "get<obs> => proc<obs>"  # for obs = ship, buoy, plane

expands to:

get_ship => proc_ship
get_buoy => proc_buoy
get_plane => proc_plane

9.6.7. Parameterized Cycling

Two ways of constructing cycling systems are described and contrasted in Workflows For Cycling Systems. For most purposes use of a proper cycling workflow is recommended, wherein cylc incrementally generates the date-time sequence and extends the workflow, potentially indefinitely, at run time. For smaller systems of finite duration, however, parameter expansion can be used to generate a sequence of pre-defined tasks as a proxy for cycling.

Here’s a cycling workflow of two-monthly model runs for one year, with previous-instance model dependence (e.g. for model restart files):

[scheduling]
    initial cycle point = 2020-01
    final cycle point = 2020-12
    [[dependencies]]
        [[[R1]]]  # Run once, at the initial point.
            graph = "prep => model"
        [[[P2M]]]  # Run at 2-month intervals between the initial and final points.
            graph = "model[-P2M] => model => post_proc & archive"
[runtime]
    [[model]]
        script = "run-model $CYLC_TASK_CYCLE_POINT"

And here’s how to do the same thing with parameterized tasks:

[cylc]
    [[parameters]]
        chunk = 1..6
[scheduling]
    [[dependencies]]
        graph = """prep => model<chunk=1>
                     model<chunk-1> => model<chunk> =>
                       post_proc<chunk> & archive<chunk>"""
[runtime]
    [[model<chunk>]]
        script = """
# Compute start date from chunk index and interval, then run the model.
INITIAL_POINT=2020-01
INTERVAL_MONTHS=2
OFFSET_MONTHS=$(( (CYLC_TASK_PARAM_chunk - 1)*INTERVAL_MONTHS ))
OFFSET=P${OFFSET_MONTHS}M  # e.g. P4M for chunk=3
run-model $(cylc cyclepoint --offset=$OFFSET $INITIAL_POINT)"""

The two workflows are shown together in Fig. 32. They both achieve the same result, and both can include special tasks at the start, end, or anywhere in between. But as noted earlier the parameterized version has several disadvantages: it must be finite in extent and not too large; the date-time arithmetic has to be done by the user; and the full extent of the workflow will be visible at all times as the suite runs.

_images/eg2-static.png
_images/eg2-dynamic.png

Fig. 32 Parameterized (top) and cycling (bottom) versions of the same workflow. The first three cycle points are shown in the cycling case. The parameterized case does not have “cycle points”.

Here’s a yearly-cycling suite with four parameterized chunks in each cycle point:

[cylc]
    [[parameters]]
        chunk = 1..4
[scheduling]
    initial cycle point = 2020-01
    [[dependencies]]
        [[[P1Y]]]
            graph = """model<chunk-1> => model<chunk>
                    model<chunk=4>[-P1Y] => model<chunk=1>"""

Note

The inter-cycle trigger connects the first chunk in each cycle point to the last chunk in the previous cycle point. Of course it would be simpler to just use 3-monthly cycling:

[scheduling]
    initial cycle point = 2020-01
    [[dependencies]]
        [[[P3M]]]
            graph = "model[-P3M] => model"

Here’s a possible valid use-case for mixed cycling: consider a portable date-time cycling workflow of model jobs that can each take too long to run on some supported platforms. This could be handled without changing the cycling structure of the suite by splitting the run (at each cycle point) into a variable number of shorter steps, using more steps on less powerful hosts.

9.6.7.1. Cycle Point And Parameter Offsets At Start-Up

In cycling workflows cylc ignores anything earlier than the suite initial cycle point. So this graph:

graph = "model[-P1D] => model"

simplifies at the initial cycle point to this:

graph = "model"

Similarly, parameter offsets are ignored if they extend beyond the start of the parameter value list. So this graph:

graph = "model<chunk-1> => model<chunk>"

simplifies for chunk=1 to this:

graph = "model_chunk1"

Note

The initial cut-off applies to every parameter list, but only to cycle point sequences that start at the suite initial cycle point. Therefore it may be somewhat easier to use parameterized cycling if you need multiple date-time sequences with different start points in the same suite. We plan to allow this sequence-start simplification for any date-time sequence in the future, not just at the suite initial point, but it needs to be optional because delayed-start cycling tasks sometimes need to trigger off earlier cycling tasks.

9.7. Jinja2

Note

This section needs to be revised - the Parameterized Task feature introduced in cylc-6.11.0 (see Parameterized Tasks) provides a cleaner way to auto-generate tasks without coding messy Jinja2 loops.

Cylc has built in support for the Jinja2 template processor in suite configurations. Jinja2 variables, mathematical expressions, loop control structures, conditional logic, etc., are automatically processed to generate the final suite configuration seen by cylc.

The need for Jinja2 processing must be declared with a hash-bang comment as the first line of the suite.rc file:

#!jinja2
# ...

Potential uses for this include automatic generation of repeated groups of similar tasks and dependencies, and inclusion or exclusion of entire suite sections according to the value of a single flag. Consider a large complicated operational suite and several related parallel test suites with slightly different task content and structure (the parallel suites, for instance, might take certain large input files from the operation or the archive rather than downloading them again). These can now be maintained as a single master suite configuration that reconfigures itself according to the value of a flag variable indicating the intended use.

Template processing is the first thing done on parsing a suite configuration so Jinja2 expressions can appear anywhere in the file (inside strings and namespace headings, for example).
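
For example (a minimal sketch), a Jinja2 variable can appear inside a namespace heading:

#!jinja2
{% set NAME = 'proc' %}
[runtime]
    [[{{ NAME }}_obs]]  # parsed as [[proc_obs]]
        script = echo "hello from {{ NAME }}"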

Jinja2 is well documented, so here we just provide an example suite that uses it. The meaning of the embedded Jinja2 code should be reasonably self-evident to anyone familiar with standard programming techniques.

_images/jinja2-ensemble-graph.png

Fig. 33 The Jinja2 ensemble example suite graph.

The jinja2.ensemble example, graphed in Fig. 33, shows an ensemble of similar tasks generated using Jinja2:

#!jinja2
{% set N_MEMBERS = 5 %}
[scheduling]
    [[dependencies]]
        graph = """{# generate ensemble dependencies #}
            {% for I in range( 0, N_MEMBERS ) %}
               foo => mem_{{ I }} => post_{{ I }} => bar
            {% endfor %}"""

Here is the generated suite configuration, after Jinja2 processing:

#!jinja2
[scheduling]
    [[dependencies]]
        graph = """
          foo => mem_0 => post_0 => bar
          foo => mem_1 => post_1 => bar
          foo => mem_2 => post_2 => bar
          foo => mem_3 => post_3 => bar
          foo => mem_4 => post_4 => bar
                """

And finally, the jinja2.cities example uses variables, includes or excludes special cleanup tasks according to the value of a logical flag, and automatically generates all dependencies and family relationships for a group of tasks that is repeated for each city in the suite. To add a new city and its associated tasks and dependencies, simply add the city name to the list at the top of the file. The suite is graphed, with the New York City task family expanded, in Fig. 34.

#!Jinja2
[meta]
    title = "Jinja2 city suite example."
    description = """
Illustrates use of variables and math expressions, and programmatic
generation of groups of related dependencies and runtime properties."""

{% set HOST = "SuperComputer" %}
{% set CITIES = 'NewYork', 'Philadelphia', 'Newark', 'Houston', 'SantaFe', 'Chicago' %}
{% set CITYJOBS = 'one', 'two', 'three', 'four' %}
{% set LIMIT_MINS = 20 %}

{% set CLEANUP = True %}

[scheduling]
    initial cycle point = 2011-08-08T12
    [[ dependencies ]]
{% if CLEANUP %}
        [[[T23]]]
            graph = "clean"
{% endif %}
        [[[T00,T12]]]
            graph = """
                    setup => get_lbc & get_ic # foo
{% for CITY in CITIES %} {# comment #}
                    get_lbc => {{ CITY }}_one
                    get_ic => {{ CITY }}_two
                    {{ CITY }}_one & {{ CITY }}_two => {{ CITY }}_three & {{ CITY }}_four
{% if CLEANUP %}
                    {{ CITY }}_three & {{ CITY }}_four => cleanup
{% endif %}
{% endfor %}
                    """
[runtime]
    [[on_{{ HOST }} ]]
        [[[remote]]]
            host = {{ HOST }}
            # (remote cylc directory is set in site/user config for this host)
        [[[directives]]]
            wall_clock_limit = "00:{{ LIMIT_MINS|int() + 2 }}:00,00:{{ LIMIT_MINS }}:00"

{% for CITY in CITIES %}
    [[ {{ CITY }} ]]
        inherit = on_{{ HOST }}
{% for JOB in CITYJOBS %}
    [[ {{ CITY }}_{{ JOB }} ]]
        inherit = {{ CITY }}
{% endfor %}
{% endfor %}

[visualization]
    initial cycle point = 2011-08-08T12
    final cycle point = 2011-08-08T23
    [[node groups]]
        cleaning = clean, cleanup
    [[node attributes]]
        cleaning = 'style=filled', 'fillcolor=yellow'
        NewYork = 'style=filled', 'fillcolor=lightblue'

_images/jinja2-suite-graph.png

Fig. 34 The Jinja2 cities example suite graph, with the New York City task family expanded.

9.7.1. Accessing Environment Variables With Jinja2

This functionality is not provided by Jinja2 by default, but cylc automatically imports the user environment into the template’s global namespace (see Custom Jinja2 Filters, Tests and Globals) in a dictionary structure called environ. A usage example:

#!Jinja2
#...
[runtime]
    [[root]]
        [[[environment]]]
            SUITE_OWNER_HOME_DIR_ON_SUITE_HOST = {{environ['HOME']}}
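
Since environ is an ordinary dictionary, its get method can supply a fallback when a variable may be unset; a sketch (the SITE variable is hypothetical):

#!Jinja2
{% set SITE = environ.get('SITE', 'test-site') %}
[runtime]
    [[root]]
        [[[environment]]]
            SITE_NAME = {{ SITE }}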

In addition, the following variables are exported to this environment prior to configuration parsing to provide suite context:

CYLC_DEBUG                      # Debug mode, true or not defined
CYLC_DIR                        # Location of cylc installation used
CYLC_VERBOSE                    # Verbose mode, True or False
CYLC_VERSION                    # Version of cylc installation used

CYLC_SUITE_NAME                 # Suite name

CYLC_SUITE_DEF_PATH             # Location of the suite configuration
                                # source path on suite host,
                                # e.g. ~/cylc-run/foo
CYLC_SUITE_LOG_DIR              # Suite log directory.
CYLC_SUITE_RUN_DIR              # Location of the suite run directory in
                                # suite host, e.g. ~/cylc-run/foo
CYLC_SUITE_SHARE_DIR            # Suite (or task post parsing!)
                                # shared directory.
CYLC_SUITE_WORK_DIR             # Suite work directory.

Note

The example above emphasizes that the environment - including the suite context variables - is read on the suite host when the suite configuration is parsed, not at task run time on job hosts.

9.7.2. Custom Jinja2 Filters, Tests and Globals

Jinja2 has three different namespaces used to separate “globals”, “filters” and “tests”. Globals are template-wide accessible variables and functions. Cylc extends this namespace with the “environ” dictionary and the “raise” and “assert” functions for raising exceptions (see Raising Exceptions).

Filters can be used to modify variable values and are applied using pipe notation. For example, the built-in trim filter strips leading and trailing white space from a string:

{% set MyString = "   dog   " %}
{{ MyString | trim() }}  # "dog"

Additionally, variable values can be tested using the “is” keyword followed by the name of the test, e.g. VARIABLE is defined. See the official Jinja2 documentation for the available built-in globals, filters and tests.

Cylc also supports custom Jinja2 globals, filters and tests. A custom global, filter or test is a single Python function in a source file with the same name as the function (plus “.py” extension) and stored in one of the following locations:

  • <cylc-dir>/lib/Jinja2[namespace]/
  • [suite configuration directory]/Jinja2[namespace]/
  • $HOME/.cylc/Jinja2[namespace]/

where [namespace]/ is one of Globals/, Filters/ or Tests/.

In the argument list of a filter or test function, the first argument is the variable value to be “filtered” or “tested”, respectively, and subsequent arguments can be whatever else is needed. Currently there are three custom filters:

9.7.2.1. pad

The “pad” filter is for padding string values to some constant length with a fill character - useful for generating task names and related values in ensemble suites:

{% for i in range(0,100) %}  # 0, 1, ..., 99
    {% set j = i | pad(2,'0') %}
    [[A_{{j}}]]         # [[A_00]], [[A_01]], ..., [[A_99]]
{% endfor %}
9.7.2.2. strftime

The “strftime” filter can be used to format ISO8601 date-time strings using an strftime string.

{% set START_CYCLE = '10661014T08+01' %}
{{ START_CYCLE | strftime('%H') }}  # 08

Examples:

  • {{START_CYCLE | strftime('%Y')}} - 1066
  • {{START_CYCLE | strftime('%m')}} - 10
  • {{START_CYCLE | strftime('%d')}} - 14
  • {{START_CYCLE | strftime('%H:%M:%S %z')}} - 08:00:00 +01

It is also possible to parse non-standard date-time strings by passing a strptime string as the second argument.

Examples:

  • {{'12,30,2000' | strftime('%m', '%m,%d,%Y')}} - 12
  • {{'1066/10/14 08:00:00' | strftime('%Y%m%dT%H', '%Y/%m/%d %H:%M:%S')}} - 10661014T08
9.7.2.3. duration_as

The “duration_as” filter can be used to convert ISO8601 duration strings to a floating-point number in one of several different units. The unit for the conversion can be specified in a case-insensitive short or long form:

  • Seconds - “s” or “seconds”
  • Minutes - “m” or “minutes”
  • Hours - “h” or “hours”
  • Days - “d” or “days”
  • Weeks - “w” or “weeks”

Within the suite, this becomes:

{% set CYCLE_INTERVAL = 'PT1D' %}
{{ CYCLE_INTERVAL | duration_as('h') }}  # 24.0
{% set CYCLE_SUBINTERVAL = 'PT30M' %}
{{ CYCLE_SUBINTERVAL | duration_as('hours') }}  # 0.5
{% set CYCLE_INTERVAL = 'PT1D' %}
{{ CYCLE_INTERVAL | duration_as('s') }}  # 86400.0
{% set CYCLE_SUBINTERVAL = 'PT30M' %}
{{ CYCLE_SUBINTERVAL | duration_as('seconds') }}  # 1800.0

While the filtered value is a floating-point number, it is often necessary to supply an integer to suite entities (e.g. environment variables) that require one. This is accomplished by chaining filters, as in the following examples and the sketch after them:

  • {{CYCLE_INTERVAL | duration_as('h') | int}} - 24
  • {{CYCLE_SUBINTERVAL | duration_as('h') | int}} - 0
  • {{CYCLE_INTERVAL | duration_as('s') | int}} - 86400
  • {{CYCLE_SUBINTERVAL | duration_as('s') | int}} - 1800
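
As a sketch of typical use (the task and variable names here are illustrative), the integer result can be exported to a task environment:

#!Jinja2
{% set CYCLE_INTERVAL = 'PT6H' %}
[runtime]
    [[archive]]
        [[[environment]]]
            # a whole number of seconds, for tools that cannot parse ISO8601:
            WINDOW_SECS = {{ CYCLE_INTERVAL | duration_as('s') | int }}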

9.7.3. Associative Arrays In Jinja2

Associative arrays (dicts in Python) can be very useful. Here’s an example, from <cylc-dir>/etc/examples/jinja2/dict:

#!Jinja2
{% set obs_types = ['airs', 'iasi'] %}
{% set resource = { 'airs':'ncpus=9', 'iasi':'ncpus=20' } %}

[scheduling]
    [[dependencies]]
        graph = OBS
[runtime]
    [[OBS]]
        [[[job]]]
            batch system = pbs
{% for i in obs_types %}
    [[ {{i}} ]]
        inherit = OBS
        [[[directives]]]
            -l = {{ resource[i] }}
{% endfor %}

Here’s the result:

$ cylc get-suite-config -i [runtime][airs]directives SUITE
-l = ncpus=9

9.7.4. Jinja2 Default Values And Template Inputs

The values of Jinja2 variables can be passed in from the cylc command line rather than hardwired in the suite configuration. Here’s an example, from <cylc-dir>/etc/examples/jinja2/defaults:

#!Jinja2
[meta]
    title = "Jinja2 example: use of defaults and external input"
    description = """
The template variable FIRST_TASK must be given on the cylc command line
using --set or --set-file=FILE; two other variables, LAST_TASK and
N_MEMBERS can be set similarly, but if not they have default values."""

{% set LAST_TASK = LAST_TASK | default( 'baz' ) %}
{% set N_MEMBERS = N_MEMBERS | default( 3 ) | int %}

{# input of FIRST_TASK is required - no default #}

[scheduling]
    initial cycle point = 20100808T00
    final cycle point   = 20100816T00
    [[dependencies]]
        [[[0]]]
            graph = """{{ FIRST_TASK }} => ENS
                 ENS:succeed-all => {{ LAST_TASK }}"""
[runtime]
    [[ENS]]
{% for I in range( 0, N_MEMBERS ) %}
    [[ mem_{{ I }} ]]
        inherit = ENS
{% endfor %}

Here’s the result:

$ cylc list foo
Jinja2 Template Error
'FIRST_TASK' is undefined
cylc-list foo  failed:  1

$ cylc list --set FIRST_TASK=bob foo
bob
baz
mem_2
mem_1
mem_0

$ cylc list --set FIRST_TASK=bob --set LAST_TASK=alice foo
bob
alice
mem_2
mem_1
mem_0

$ cylc list --set FIRST_TASK=bob --set N_MEMBERS=10 foo
mem_9
mem_8
mem_7
mem_6
mem_5
mem_4
mem_3
mem_2
mem_1
mem_0
baz
bob

Note also that cylc view --set FIRST_TASK=bob --jinja2 SUITE will show the suite with the Jinja2 variables as set.
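
Template variables can also be grouped in a file and passed with --set-file=FILE, which keeps long command lines manageable; a sketch, assuming one NAME=VALUE pair per line (the file name is arbitrary):

$ cat tvars.txt
FIRST_TASK=bob
N_MEMBERS=5
$ cylc list --set-file=tvars.txt foo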

Note

Suites started with template variables set on the command line will restart with the same settings. However, you can set them again on the cylc restart command line if they need to be overridden.

9.7.5. Jinja2 Variable Scope

Jinja2 variable scoping rules may be surprising. Variables set inside a for loop block, for instance, are not accessible outside of the block, so the following will print # FOO is False even if the condition holds for some item, not # FOO is True:

{% set FOO = false %}
{% for item in items %}
    {% if item.check_something() %}
        {% set FOO = true %}
    {% endif %}
{% endfor %}
# FOO is {{FOO}}

The Jinja2 documentation suggests using alternative constructs like the loop else block or the special loop variable. More complex use cases can be handled using namespace objects, which allow changes to propagate across scopes:

{% set ns = namespace(foo=false) %}
{% for item in items %}
    {% if item.check_something() %}
        {% set ns.foo = true %}
    {% endif %}
{% endfor %}
# FOO is {{ns.foo}}

For details, see Jinja2 Template Designer Documentation > Assignments.

9.7.6. Raising Exceptions

Cylc provides two functions for raising exceptions using Jinja2. These exceptions are raised when the suite.rc file is loaded and will prevent a suite from running.

Note

These functions must be invoked in {{ ... }} Jinja2 expressions, as opposed to {% ... %} statement blocks.

9.7.6.1. Raise

The “raise” function will result in an error containing the provided text.

{% if not VARIABLE is defined %}
    {{ raise('VARIABLE must be defined for this suite.') }}
{% endif %}
9.7.6.2. Assert

The “assert” function will raise an exception containing the text provided in the second argument, provided that the first argument evaluates to False. The following example is equivalent to the “raise” example above.

{{ assert(VARIABLE is defined, 'VARIABLE must be defined for this suite.') }}
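
For example, a minimal sketch (N_MEMBERS is an illustrative variable) combining assert with a default value to validate a numeric input:

{% set N_MEMBERS = N_MEMBERS | default( 3 ) | int %}
{{ assert(N_MEMBERS > 0, 'N_MEMBERS must be a positive integer') }}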

9.7.7. Importing additional Python modules

Jinja2 allows variable and macro definitions to be gathered in a separate template that can be imported into (and thus shared among) other templates.

{% import "suite-utils.rc" as utils %}
{% from "suite-utils.rc" import VARIABLE as ALIAS %}
{{ utils.VARIABLE is equalto(ALIAS) }}

Cylc extends this functionality to allow import of arbitrary Python modules.

{% from "itertools" import product %}
[runtime]
{% for group, member in product(['a', 'b'], [0, 1, 2]) %}
    [[{{group}}_{{member}}]]
{% endfor %}

For clarity, and to avoid ambiguity with template file names, Python module names can be prefixed with __python__:

{% from "__python__.itertools" import product %}

9.8. EmPy

In addition to Jinja2, Cylc supports the EmPy template processor in suite configurations. Similarly to Jinja2, EmPy provides variables, mathematical expressions, loop control structures, conditional logic, etc., that are expanded to generate the final suite configuration seen by Cylc. See the EmPy documentation for more details on its templating features and how to use them.

Note

EmPy is not bundled with Cylc and must be installed separately. It must be importable by Python via a standard import em. Please also note that there is another Python package called “em” that provides a conflicting module of the same name. You can run the cylc check-software command to check your installation.

The need for EmPy processing must be declared with a hash-bang comment as the first line of the suite.rc file:

#!empy
# ...

An example suite empy.cities demonstrating its use is shown below. It is a translation of the jinja2.cities example shown earlier and can be compared directly against it.

#!EmPy
[meta]
    title = "EmPy city suite example."
    description = """
Illustrates use of variables and math expressions, and programmatic
generation of groups of related dependencies and runtime properties.
"""

@{
HOST = "SuperComputer"
CITIES = 'NewYork', 'Philadelphia', 'Newark', 'Houston', 'SantaFe', 'Chicago'
CITYJOBS = 'one', 'two', 'three', 'four'
LIMIT_MINS = 20
CLEANUP = True
}

[scheduling]
    initial cycle point = 2011-08-08T12
    [[ dependencies ]]
@[ if CLEANUP ]
        [[[T23]]]
            graph = "clean"
@[ end if ]
        [[[T00,T12]]]
            graph = """
                setup => get_lbc & get_ic # foo
@[ for CITY in CITIES ]@# comment
                get_lbc => @(CITY)_one
                get_ic => @(CITY)_two
                @(CITY)_one & @(CITY)_two => @(CITY)_three & @(CITY)_four
    @[ if CLEANUP ]
                @(CITY)_three & @(CITY)_four => cleanup
    @[ end if ]
@[ end for ]
            """
[runtime]
    [[on_@HOST ]]
        [[[remote]]]
            host = @HOST
            # (remote cylc directory is set in site/user config for this host)
        [[[directives]]]
            wall_clock_limit = "00:@(LIMIT_MINS + 2):00,00:@(LIMIT_MINS):00"

@[ for CITY in CITIES ]
    [[ @(CITY) ]]
        inherit = on_@(HOST)
    @[ for JOB in CITYJOBS ]
    [[ @(CITY)_@(JOB) ]]
        inherit = @CITY
    @[ end for ]
@[ end for ]

@empy.include("./suite-visualization.rc")

For basic usage the difference between Jinja2 and EmPy amounts to a different markup syntax, with little else to distinguish them. EmPy might be preferable, however, in cases where more complicated processing logic has to be implemented.

EmPy is a system for embedding Python expressions and statements in template text. It makes the full power of the Python language and its ecosystem easily accessible from within the template. This might be desirable for several reasons:

  • no need to learn a different language and its idiosyncrasies just for writing template logic
  • lambda functions and list and dictionary comprehensions can make template code smaller and more readable than the Jinja2 equivalent (see the sketch after this list)
  • natural and straightforward integration with the Python package ecosystem
  • the absence of a two-language barrier between template logic and processing extensions makes it easier to refactor and maintain the template code as its complexity grows - inline pieces of Python code can be gathered into subroutines and eventually into separate modules and packages in a seamless manner.
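
As a sketch of the inline Python this allows (task names are illustrative), a list comprehension can generate an ensemble member list in one line:

#!EmPy
@{ MEMBERS = ['mem_%02d' % i for i in range(4)] }
[scheduling]
    [[dependencies]]
        graph = """
@[ for M in MEMBERS ]
            foo => @M => bar
@[ end for ]
        """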

9.9. Omitting Tasks At Runtime

It is sometimes convenient to omit certain tasks from the suite at runtime without actually deleting their definitions from it.

Defining [runtime] properties for tasks that do not appear in the suite graph results in verbose-mode validation warnings that the tasks are disabled. They cannot be used because the suite graph is what defines their dependencies and valid cycle points. Nevertheless, it is legal to leave these orphaned runtime sections in the suite configuration because it allows you to temporarily remove tasks from the suite by simply commenting them out of the graph.

To omit a task from the suite at runtime but still leave it fully defined and available for use (by insertion or cylc submit), use one or both of the [scheduling][[special tasks]] lists, include at start-up or exclude at start-up (documented in [scheduling] -> [[special tasks]] -> include at start-up and [scheduling] -> [[special tasks]] -> exclude at start-up). Then the graph still defines the validity of the tasks and their dependencies, but they are not actually loaded into the suite at start-up. Other tasks that depend on the omitted ones, if any, will have to wait on their insertion at a later time or otherwise be triggered manually.
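
A minimal sketch (the task names are illustrative):

[scheduling]
    [[special tasks]]
        exclude at start-up = archive, verify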

Finally, with Jinja2 (see Jinja2) you can radically alter suite structure by including or excluding tasks from the [scheduling] and [runtime] sections according to the value of a single logical flag defined at the top of the suite.

9.10. Naked Dummy Tasks And Strict Validation

A naked dummy task appears in the suite graph but has no explicit runtime configuration section. Such tasks automatically inherit the default “dummy task” configuration from the root namespace. This is very useful because it allows functional suites to be mocked up quickly for test and demonstration purposes by simply defining the graph. It is somewhat dangerous, however, because there is no way to distinguish an intentional naked dummy task from one generated by typographic error: misspelling a task name in the graph results in a new naked dummy task replacing the intended task in the affected trigger expression; and misspelling a task name in a runtime section heading results in the intended task becoming a dummy task itself (by divorcing it from its intended runtime config section).

To avoid this problem any dummy task used in a real suite should not be naked - i.e. it should have an explicit entry under the [runtime] section of the suite configuration, even if the section is empty. This results in exactly the same dummy task behaviour, via implicit inheritance from root, but it allows use of cylc validate --strict to catch errors in task names by failing the suite if any naked dummy tasks are detected.
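
For example, to keep an intentional dummy task (named cleanup here, for illustration) while still passing strict validation, give it an empty runtime section:

[runtime]
    [[cleanup]]
        # deliberately empty: 'cleanup' still behaves as a dummy task via
        # inheritance from root, but it is no longer naked

With this entry present, cylc validate --strict will no longer reject the suite on account of cleanup.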

[1] An OR operator on the right doesn’t make much sense: if “B or C” triggers off A, what exactly should cylc do when A finishes?
[2] In NWP forecast analysis suites parts of the observation processing and data assimilation subsystem will typically also depend on model background fields generated by the previous forecast.

10. Task Implementation

Existing scripts and executables can be used as cylc tasks without modification so long as they return a standard exit status - zero on success, non-zero for failure - and do not spawn detaching processes internally (see Avoid Detaching Processes).

10.1. Task Job Scripts

When the suite server program determines that a task is ready to run it generates a job script for the task, and submits it to run (see Task Job Submission and Management).

Job scripts encapsulate configured task runtime settings: script and environment items, if defined, are just concatenated in the order shown in Fig. 35, to make the job script. Everything executes in the same shell, so each part of the script can potentially affect the environment of subsequent parts.

_images/anatomy-of-a-job-script.png

Fig. 35 The order in which task runtime script and environment configuration items are combined, in the same shell, to create a task job script. cylc-env represents Cylc-defined environment variables, and user-env user-defined variables from the task [environment] section. (Note this is not a suite dependency graph).

Task job scripts are written to the suite’s job log directory. They can be printed with cylc cat-log or generated and printed with cylc jobscript.
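
For example (the suite and task identifiers are placeholders):

$ cylc jobscript SUITE foo.1     # generate and print a job script
$ cylc cat-log -f j SUITE foo.1  # print the job script written for a submitted task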

10.2. Inlined Tasks

Task script items can be multi-line strings of bash code, so many tasks can be entirely inlined in the suite.rc file. For anything more than a few lines of code, however, we recommend using external shell scripts to allow independent testing, re-use, and shell mode editing.

10.3. Task Messages

Task messages can be sent back to the suite server program to report completed outputs and arbitrary messages of different severity levels.

Some types of message - in addition to events like task failure - can optionally trigger execution of event handlers in the suite server program (see Task Event Handling).

Normal severity messages are printed to job.out and logged by the suite server program:

cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" \
  "Hello from ${CYLC_TASK_ID}"

CUSTOM severity messages are printed to job.out, logged by the suite server program, and can be used to trigger custom event handlers:

cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" \
  "CUSTOM:data available for ${CYLC_TASK_CYCLE_POINT}"

Custom severity messages and event handlers can be used to signal special events that are neither routine information nor an error condition, such as production of a particular data file. Task output messages, used for triggering other tasks, can also be sent with custom severity if need be.

WARNING severity messages are printed to job.err, logged by the suite server program, and can be passed to warning event handlers:

cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" \
  "WARNING:Uh-oh, something's not right here."

CRITICAL severity messages are printed to job.err, logged by the suite server program, and can be passed to critical event handlers:

cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" \
  "CRITICAL:ERROR occurred in process X!"

10.4. Aborting Job Scripts on Error

Task job scripts use set -e to abort on any error, and trap ERR, EXIT, and SIGTERM to send task failed messages back to the suite server program before aborting. Other scripts called from job scripts should therefore abort with a standard non-zero exit status on error, to trigger the job script error trap.

To prevent a command that is expected to generate a non-zero exit status from triggering the exit trap, protect it with a control statement such as:

if cmp FILE1 FILE2; then
    :  # success: do stuff
else
    :  # failure: do other stuff
fi

Task job scripts also use set -u to abort on referencing any undefined variable (useful for picking up typos), and set -o pipefail to abort if any part of a pipe fails (by default the shell only returns the exit status of the final command in a pipeline).
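
Under these settings, scripting needs to guard optional variables and expected pipeline failures explicitly; a sketch (MY_OPT and the file name are illustrative):

# set -u: supply a default where a variable may legitimately be unset
echo "option is ${MY_OPT:-unset}"
# set -o pipefail: tolerate an expected failure in part of a pipeline
grep WARN job.err | head -n 5 || true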

10.4.1. Custom Failure Messages

Critical events normally warrant aborting a job script rather than just sending a message. As described just above, exit 1 or any failing command not protected by the surrounding scripting will cause a job script to abort and report failure to the suite server program, potentially triggering a failed task event handler.

For failures detected by the scripting you could send a critical message back before aborting, potentially triggering a critical task event handler:

if ! /bin/false; then
  cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" \
    "CRITICAL:ERROR: /bin/false failed!"
  exit 1
fi

To abort a job script with a custom message that can be passed to a failed task event handler, use the built-in cylc__job_abort shell function:

if ! /bin/false; then
  cylc__job_abort "ERROR: /bin/false failed!"
fi

10.5. Avoid Detaching Processes

If a task script starts background sub-processes and does not wait on them, or internally submits jobs to a batch scheduler and then exits immediately, the detached processes will not be visible to cylc and the task will appear to finish when the top-level script finishes. You will need to modify scripts like this to make them execute all sub-processes in the foreground (or use the shell wait command to wait on them before exiting) and to prevent job submission commands from returning before the job completes (e.g. llsubmit -s for Loadleveler, qsub -sync yes for Sun Grid Engine, and qsub -W block=true for PBS).

If this is not possible - perhaps you don’t have control over the script or can’t work out how to fix it - one alternative approach is to use another task to repeatedly poll for the results of the detached processes:

[scheduling]
    [[dependencies]]
        graph = "model => checker => post-proc"
[runtime]
    [[model]]
        # Uh-oh, this script does an internal job submission to run model.exe:
        script = "run-model.sh"
    [[checker]]
        # Fail and retry every minute (for 10 tries at the most) if model's
        # job.done indicator file does not exist yet.
        script = "[[ ! -f $RUN_DIR/job.done ]] && exit 1"
        [[[job]]]
            execution retry delays = 10 * PT1M

11. Task Job Submission and Management

For the requirements a command, script, or program must fulfill in order to function as a cylc task, see Task Implementation. This section explains how tasks are submitted by the suite server program when they are ready to run, and how to define new batch system handlers.

When a task is ready cylc generates a job script (see Task Job Scripts). The job script is submitted to run by the batch system chosen for the task. Different tasks can use different batch systems. Like other runtime properties, you can set a suite default batch system and override it for specific tasks or families:

[runtime]
    [[root]]  # suite defaults
        [[[job]]]
            batch system = loadleveler
    [[foo]]  # just task foo
        [[[job]]]
            batch system = at

11.1. Supported Job Submission Methods

Cylc supports a number of commonly used batch systems. See Custom Job Submission Methods for how to add new job submission methods.

11.1.1. background

Runs task job scripts as Unix background processes.

If an execution time limit is specified for a task, its job will be wrapped by the timeout command.

11.1.2. at

Submits task job scripts to the rudimentary Unix at scheduler. The atd daemon must be running.

If an execution time limit is specified for a task, its job will be wrapped by the timeout command.

11.1.3. loadleveler

Submits task job scripts to loadleveler by the llsubmit command. Loadleveler directives can be provided in the suite.rc file:

[runtime]
    [[my_task]]
        [[[job]]]
            batch system = loadleveler
            execution time limit = PT10M
        [[[directives]]]
            foo = bar
            baz = qux

These are written to the top of the task job script like this:

#!/bin/bash
# DIRECTIVES
# @ foo = bar
# @ baz = qux
# @ wall_clock_limit = 660,600
# @ queue

If restart=yes is specified as a directive for loadleveler, the job will automatically trap SIGUSR1, which loadleveler may use to preempt the job. On trapping SIGUSR1, the job will inform the suite that it has been vacated by loadleveler. This will put it back in the submitted state until it starts running again.

If execution time limit is specified, it is used to generate the wall_clock_limit directive. The setting is assumed to be the soft limit. The hard limit will be set by adding an extra minute to the soft limit. Do not specify the wall_clock_limit directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.

11.1.4. lsf

Submits task job scripts to IBM Platform LSF by the bsub command. LSF directives can be provided in the suite.rc file:

[runtime]
    [[my_task]]
        [[[job]]]
            batch system = lsf
            execution time limit = PT10M
        [[[directives]]]
            -q = foo

These are written to the top of the task job script like this:

#!/bin/bash
# DIRECTIVES
#BSUB -q = foo
#BSUB -W = 10

If execution time limit is specified, it is used to generate the -W directive. Do not specify the -W directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.

11.1.5. pbs

Submits task job scripts to PBS (or Torque) by the qsub command. PBS directives can be provided in the suite.rc file:

[runtime]
    [[my_task]]
        [[[job]]]
            batch system = pbs
            execution time limit = PT1M
        [[[directives]]]
            -V =
            -q = foo
            -l nodes = 1

These are written to the top of the task job script like this:

#!/bin/bash
# DIRECTIVES
#PBS -V
#PBS -q foo
#PBS -l nodes=1
#PBS -l walltime=60

If execution time limit is specified, it is used to generate the -l walltime directive. Do not specify the -l walltime directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.

11.1.6. moab

Submits task job scripts to the Moab workload manager by the msub command. Moab directives can be provided in the suite.rc file; the syntax is very similar to PBS:

[runtime]
    [[my_task]]
        [[[job]]]
            batch system = moab
            execution time limit = PT1M
        [[[directives]]]
            -V =
            -q = foo
            -l nodes = 1

These are written to the top of the task job script like this:

#!/bin/bash
# DIRECTIVES
#PBS -V
#PBS -q foo
#PBS -l nodes=1
#PBS -l walltime=60

(Moab understands #PBS directives).

If execution time limit is specified, it is used to generate the -l walltime directive. Do not specify the -l walltime directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.

11.1.7. sge

Submits task job scripts to Sun/Oracle Grid Engine by the qsub command. SGE directives can be provided in the suite.rc file:

[runtime]
    [[my_task]]
        [[[job]]]
            batch system = sge
            execution time limit = P1D
        [[[directives]]]
            -cwd =
            -q = foo
            -l h_data = 1024M
            -l h_rt = 24:00:00

These are written to the top of the task job script like this:

#!/bin/bash
# DIRECTIVES
#$ -cwd
#$ -q foo
#$ -l h_data=1024M
#$ -l h_rt=24:00:00

If execution time limit is specified, it is used to generate the -l h_rt directive. Do not specify the -l h_rt directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.

11.1.8. slurm

Submits task job scripts to Simple Linux Utility for Resource Management by the sbatch command. SLURM directives can be provided in the suite.rc file:

[runtime]
    [[my_task]]
        [[[job]]]
            batch system = slurm
            execution time limit = PT1H
        [[[directives]]]
            --nodes = 5
            --account = QXZ5W2

Note

Since not all SLURM commands have a short form, cylc requires the long form directives.

These are written to the top of the task job script like this:

#!/bin/bash
#SBATCH --nodes=5
#SBATCH --time=60:00
#SBATCH --account=QXZ5W2

If execution time limit is specified, it is used to generate the --time directive. Do not specify the --time directive explicitly if execution time limit is specified. Otherwise, the execution time limit known by the suite may be out of sync with what is submitted to the batch system.

Cylc supports heterogeneous Slurm jobs via special numbered directive prefixes that distinguish repeated directives from one another:

[runtime]
    [[my_task]]
        # run two heterogeneous job components:
        script = srun sleep 10 : sleep 30
        [[[job]]]
            batch system = slurm
            execution time limit = PT1H
        [[[directives]]]
            --account = QXZ5W2
            hetjob_0_--mem = 1G  # first prefix must be "0"
            hetjob_0_--nodes = 3
            hetjob_1_--mem = 2G
            hetjob_1_--nodes = 6

The resulting formatted directives are:

#!/bin/bash
#SBATCH --time=60:00
#SBATCH --account=QXZ5W2
#SBATCH --mem=1G
#SBATCH --nodes=3
#SBATCH hetjob
#SBATCH --mem=2G
#SBATCH --nodes=6

Note

For older Slurm versions with packjob instead of hetjob, use batch system = slurm_packjob and directive prefixes packjob_0_ etc.

11.1.9. Default Directives Provided

For batch systems that use job file directives (PBS, Loadleveler, etc.) default directives are provided to set the job name, stdout and stderr file paths, and the execution time limit (if specified).

Cylc constructs the job name string using a combination of the task ID and the suite name. PBS fails a job submission if the job name given by -N name is too long: for version 12 or below the limit is 15 characters; for version 13 it is 236 characters. The default setting truncates the job name string to 15 characters. If you have PBS 13 at your site, you should modify your site’s global configuration file to allow longer job names. (See also [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> job name length maximum.) For example:

[hosts]
    [[myhpc*]]
        [[[batch systems]]]
            [[[[pbs]]]]
                # PBS 13
                job name length maximum = 236

11.1.10. Directives Section Quirks (PBS, SGE, …)

To specify an option with no argument, such as -V in PBS or -cwd in SGE, you must give a null string as the directive value in the suite.rc file.

The left hand side of a setting (i.e. the string before the first equal sign) must be unique. To specify multiple values using an option such as -l option in PBS, SGE, etc., either specify all items in a single line:

-l=select=28:ncpus=36:mpiprocs=18:ompthreads=2:walltime=12:00:00

(Left hand side is -l. A second -l=... line will override the first.)

Or separate the items:

-l select=28
-l ncpus=36
-l mpiprocs=18
-l ompthreads=2
-l walltime=12:00:00

Note

There is no equal sign after -l.

(Left hand sides are now -l select, -l ncpus, etc.)

11.2. Task stdout And stderr Logs

When a task is ready to run cylc generates a filename root to be used for the task job script and log files. The filename contains the task name, cycle point, and a submit number that increments if the same task is re-triggered multiple times:

# task job script:
~/cylc-run/tut/oneoff/basic/log/job/1/hello/01/job
# task stdout:
~/cylc-run/tut/oneoff/basic/log/job/1/hello/01/job.out
# task stderr:
~/cylc-run/tut/oneoff/basic/log/job/1/hello/01/job.err

How the stdout and stderr streams are directed into these files depends on the batch system. The background method just uses appropriate output redirection on the command line. The loadleveler method writes appropriate directives to the job script that is submitted to loadleveler.

Cylc obviously has no control over the stdout and stderr output from tasks that do their own internal output management (e.g. tasks that submit internal jobs and direct the associated output to other files). For less internally complex tasks, however, the files referred to here will be complete task job logs.

Some batch systems, such as pbs, redirect a job’s stdout and stderr streams to a separate cache area while the job is running. The contents are only copied to the normal locations when the job completes. This means that cylc cat-log or the gcylc GUI will be unable to find the job’s stdout and stderr streams while the job is running. Some sites with these batch systems are known to provide commands for viewing and/or tail-following a job’s stdout and stderr streams in these cache areas. If this is the case at your site, you can configure cylc to make use of the provided commands by adding some settings to the global site/user config. E.g.:

[hosts]
    [[HOST]]  # <= replace this with a real host name
        [[[batch systems]]]
            [[[[pbs]]]]
                err tailer = qcat -f -e %(job_id)s
                out tailer = qcat -f -o %(job_id)s
                err viewer = qcat -e %(job_id)s
                out viewer = qcat -o %(job_id)s

11.3. Overriding The Job Submission Command

To change the form of the actual command used to submit a job you do not need to define a new batch system handler; just override the command template in the relevant job submission sections of your suite.rc file:

[runtime]
    [[root]]
        [[[job]]]
            batch system = loadleveler
            # Use '-s' to stop llsubmit returning
            # until all job steps have completed:
            batch submit command template = llsubmit -s %(job)s

As explained in Suite.rc Reference, the template’s %(job)s will be substituted by the job file path.

11.4. Job Polling

For supported batch systems, one-way polling can be used to determine actual job status: the suite server program executes a process on the task host, by non-interactive ssh, to interrogate the batch queueing system there, and to read a status file that is automatically generated by the task job script as it runs.

Polling may be required to update the suite state correctly after unusual events such as a machine being rebooted with tasks running on it, or network problems that prevent task messages from getting back to the suite host.

Tasks can be polled on demand by right-clicking on them in gcylc or using the cylc poll command.

Tasks are polled automatically, once, if they time out while queueing in a batch scheduler and a submission timeout is set. (See [runtime] -> [[__NAME__]] -> [[[events]]] for how to configure timeouts).

Tasks are polled multiple times, where necessary, when they exceed their execution time limits. These are normally set with some initial delays to allow the batch systems to kill the jobs. (See [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> execution time limit polling intervals for how to configure the polling intervals).

Any tasks recorded in the submitted or running states at suite restart are automatically polled to determine what happened to them while the suite was down.

Regular polling can also be configured as a health check on tasks submitted to hosts that are known to be flaky, or as the sole method of determining task status on hosts that do not allow task messages to be routed back to the suite host.

To use polling instead of task-to-suite messaging set task communication method = poll in cylc site and user global config (see [hosts] -> [[HOST]] -> task communication method). The default polling intervals can be overridden for all suites there too (see [hosts] -> [[HOST]] -> submission polling intervals and [hosts] -> [[HOST]] -> execution polling intervals), or in specific suite configurations (in which case polling will be done regardless of the task communication method configured for the host; see [runtime] -> [[__NAME__]] -> [[[job]]] -> submission polling intervals and [runtime] -> [[__NAME__]] -> [[[job]]] -> execution polling intervals).
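
For example, a site/user global config sketch (the host name pattern is illustrative) that switches a flaky host to polling with custom intervals:

[hosts]
    [[hpc1*]]
        task communication method = poll
        submission polling intervals = PT1M
        execution polling intervals = PT1M, PT5M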

Note that regular polling is not as efficient as task messaging in updating task status, and it should be used sparingly in large suites.

Note

For polling to work correctly, the batch queueing system must have a command for listing your jobs, and the listing must display job IDs in the same format as they are returned by the submit command. For example, for pbs, moab and sge, the qstat command should list jobs with their IDs displayed in exactly the same format as they are returned by the qsub command.

11.5. Job Killing

For supported batch systems, the suite server program can execute a process on the task host, by non-interactive ssh, to kill a submitted or running job according to its batch system.

Tasks can be killed on demand by right-clicking on them in gcylc or using the cylc kill command.

11.6. Execution Time Limit

You can specify an execution time limit for all supported job submission methods. E.g.:

[runtime]
    [[task-x]]
        [[[job]]]
            execution time limit = PT1H

Jobs running with the background or at methods will be wrapped using the timeout command. For all other methods, the relevant time limit directive will be added to their job files.

The execution time limit setting also tells the suite when a task job should complete by. If a task job has not reported completing within the specified time, the suite will poll the task job. (The default setting is PT1M, PT2M, PT7M. The accumulated times for these intervals will be roughly 1 minute, 1 + 2 = 3 minutes and 1 + 2 + 7 = 10 minutes after a task job exceeds its execution time limit.)
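
The post-limit polling schedule can be tuned per batch system in the site/user global config; a sketch (HOST is a placeholder):

[hosts]
    [[HOST]]
        [[[batch systems]]]
            [[[[slurm]]]]
                execution time limit polling intervals = PT2M, PT5M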

11.6.1. Execution Time Limit and Execution Timeout

If you specify an execution time limit the execution timeout event handler will only be called if the job has not completed after the final poll (by default, 10 min after the time limit). This should only happen if the submission method you are using is not enforcing wallclock limits (unlikely) or you are unable to contact the machine to confirm the job status.

If you specify an execution timeout and not an execution time limit then the execution timeout event handler will be called as soon as the specified time is reached. The job will also be polled to check its latest status (possibly resulting in an update in its status and the calling of the relevant event handler). This behaviour is deprecated and should be avoided.

If you specify an execution timeout and an execution time limit then the execution timeout setting will be ignored.

11.7. Custom Job Submission Methods

Defining a new batch system handler requires a little Python programming. Use the built-in handlers as examples, and read the documentation in lib/cylc/batch_sys_manager.py.

11.7.1. An Example

The following qsub.py module overrides the built-in pbs batch system handler to change the directive prefix from #PBS to #QSUB:

#!/usr/bin/env python2

from cylc.batch_sys_handlers.pbs import PBSHandler

class QSUBHandler(PBSHandler):
    DIRECTIVE_PREFIX = "#QSUB "

BATCH_SYS_HANDLER = QSUBHandler()

If this is in the Python search path (see Where To Put Batch System Handler Modules below) you can use it by name in suite configurations:

[scheduling]
    [[dependencies]]
        graph = "a"
[runtime]
    [[root]]
        [[[job]]]
            batch system = qsub  # <---!
            execution time limit = PT1M
        [[[directives]]]
            -l nodes = 1
            -q = long
            -V =

Generate a job script to see the resulting directives:

$ cylc register test $HOME/test
$ cylc jobscript test a.1 | grep QSUB
#QSUB -e /home/oliverh/cylc-run/test/log/job/1/a/01/job.err
#QSUB -l nodes=1
#QSUB -l walltime=60
#QSUB -o /home/oliverh/cylc-run/test/log/job/1/a/01/job.out
#QSUB -N a.1
#QSUB -q long
#QSUB -V

(Of course this suite will fail at run time because we only changed the directive format, and PBS does not accept #QSUB directives in reality).

11.7.2. Where To Put Batch System Handler Modules

Custom batch system handlers must be installed on suite and job hosts in one of these locations:

  • under SUITE-DEF-PATH/lib/python/
  • under CYLC-PATH/lib/cylc/batch_sys_handlers/
  • or anywhere in $PYTHONPATH

Note

For Rose users: rose suite-run automatically installs SUITE-DEF-PATH/lib/python/ to job hosts.

12. External Triggers

Warning

This is a new capability and its suite configuration interface may change somewhat in future releases - see Current Limitations below.

External triggers allow tasks to trigger directly off of external events, which is often preferable to implementing long-running polling tasks in the workflow. The triggering mechanism described in this section replaces an older and less powerful one documented in Old-Style External Triggers (Deprecated).

If you can write a Python function to check the status of an external condition or event, the suite server program can call it at configurable intervals until it reports success, at which point dependent tasks can trigger and data returned by the function will be passed to the job environments of those tasks. Functions can be written for triggering off of almost anything, such as delivery of a new dataset, creation of a new entry in a database table, or appearance of new data availability notifications in a message broker.

External triggers are visible in suite visualizations as bare graph nodes (just the trigger names). They are plotted against all dependent tasks, not in a cycle point specific way like tasks. This is because external triggers may or may not be cycle point (or even task name) specific - it depends on the arguments passed to the corresponding trigger functions. For example, if an external trigger does not depend on task name or cycle point it will only be called once - albeit repeatedly until satisfied - for the entire suite run, after which the function result will be remembered for all dependent tasks throughout the suite run.

Several built-in external trigger functions are located in <cylc-dir>/lib/cylc/xtriggers/.

Trigger functions are normal Python functions, with certain constraints as described below in Custom Trigger Functions.

12.1. Built-in Clock Triggers

These are more transparent (exposed in the graph) and efficient (shared among dependent tasks) than the older clock triggers described in Clock Triggers. (However we don’t recommend wholesale conversion to the new method yet, until its interface has stabilized - see Current Limitations.)

Clock triggers, unlike other trigger functions, are executed synchronously in the main process. The clock trigger function signature looks like this:

wall_clock(offset=None)

The offset argument is a date-time duration (PT1H is 1 hour) relative to the dependent task’s cycle point (automatically passed to the function via a second argument not shown above).

In the following suite, task foo has a daily cycle point sequence, and each task instance can trigger once the wall clock time has passed its cycle point value by one hour:

[scheduling]
    initial cycle point = 2018-01-01
    [[xtriggers]]
        clock_1 = wall_clock(offset=PT1H):PT10S
    [[dependencies]]
        [[[P1D]]]
            graph = "@clock_1 => foo"
[runtime]
    [[foo]]
        script = run-foo.sh

Notice that the short label clock_1 is used to represent the trigger function in the graph. The function call interval, which determines how often the suite server program checks the clock, is optional. Here it is PT10S (i.e. 10 seconds, which is also the default value).

Argument keywords can be omitted if called in the right order, so the clock_1 trigger can also be declared like this:

[[xtriggers]]
    clock_1 = wall_clock(PT1H)

A zero-offset clock trigger does not need to be declared under the [xtriggers] section:

[scheduling]
    initial cycle point = 2018-01-01
    [[dependencies]]
        [[[P1D]]]
            # zero-offset clock trigger:
            graph = "@wall_clock => foo"
[runtime]
    [[foo]]
        script = run-foo.sh

However, when xtriggers are declared, the names used must contain only the letters a to z, in upper or lower case, and underscores.

12.2. Built-in Suite State Triggers

These can be used instead of the older suite state polling tasks described in Triggering Off Of Tasks In Other Suites for inter-suite triggering - i.e. to trigger local tasks off of remote task statuses or messages in other suites. (However we don’t recommend wholesale conversion to the new method yet, until its interface has stabilized - see Current Limitations.)

The suite state trigger function signature looks like this:

suite_state(suite, task, point, offset=None, status='succeeded',
            message=None, cylc_run_dir=None, debug=False)

The first three arguments are compulsory; they single out the target suite name (suite), task name (task) and cycle point (point). The function arguments mirror the arguments and options of the cylc suite-state command - see cylc suite-state --help for documentation.
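
The equivalent check can be run manually, which can be useful when debugging a trigger; e.g. (assuming options that mirror the function arguments above):

$ cylc suite-state up --task=foo --point=2011 --message='data ready'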

As a simple example, consider the suites in <cylc-dir>/etc/dev-suites/xtrigger/suite_state/. The “upstream” suite (which we want to trigger off of) looks like this:

[cylc]
    cycle point format = %Y
[scheduling]
    initial cycle point = 2005
    final cycle point = 2015
    [[dependencies]]
        [[[P1Y]]]
            graph = "foo => bar"
[runtime]
    [[bar]]
        script = sleep 10
    [[foo]]
        script = sleep 5; cylc message "data ready"
        [[[outputs]]]
            x = "data ready"

It must be registered and run under the name up, as referenced in the “downstream” suite that depends on it:

[cylc]
    cycle point format = %Y
[scheduling]
    initial cycle point = 2010
    [[xtriggers]]
        upstream = suite_state(suite=up, task=foo, point=%(point)s, \
            message='data ready'):PT10S
        clock_0 = wall_clock(offset=PT0H)
    [[dependencies]]
        [[[P1Y]]]
            graph = """
                foo
                @clock_0 & @upstream => FAM:succeed-all => blam
            """
[runtime]
    [[root]]
        script = sleep 5
    [[foo, blam]]
    [[FAM]]
    [[f1,f2,f3]]
        inherit = FAM

Try starting the downstream suite first, then the upstream, and watch what happens. In each cycle point the @upstream trigger in the downstream suite waits on the task foo (with the same cycle point) in the upstream suite to emit the data ready message.

Some important points to note about this:

  • the function call interval, which determines how often the suite server program calls the trigger function, is optional. Here it is PT10S (i.e. 10 seconds, which is also the default value).
  • the suite_state trigger function, like the cylc suite-state command, must have read-access to the upstream suite’s public database.
  • the cycle point argument is supplied by a string template %(point)s. The string templates available to trigger function arguments are described in Custom Trigger Functions.

The return value of the suite_state trigger function looks like this:

results = {
    'suite': suite,
    'task': task,
    'point': point,
    'offset': offset,
    'status': status,
    'message': message,
    'cylc_run_dir': cylc_run_dir
}
return (satisfied, results)

The satisfied variable is a boolean (value True or False, depending on whether or not the trigger condition was found to be satisfied). The results dictionary contains the names and values of all of the target suite state parameters. Each item in it gets qualified with the unique trigger label (“upstream” here) and passed to the environment of dependent task jobs (the members of the FAM family in this case). To see this, take a look at the job script for one of the downstream tasks:

$ cylc cat-log -f j dn f2.2011
...
cylc__job__inst__user_env() {
    # TASK RUNTIME ENVIRONMENT:
    export upstream_suite upstream_cylc_run_dir upstream_offset \
      upstream_message upstream_status upstream_point upstream_task
    upstream_suite="up"
    upstream_cylc_run_dir="/home/vagrant/cylc-run"
    upstream_offset="None"
    upstream_message="data ready"
    upstream_status="succeeded"
    upstream_point="2011"
    upstream_task="foo"}
...

Note

The task has to know the name (label) of the external trigger that it depends on - “upstream” in this case - in order to use this information. However, the name could be passed to the task environment in the suite configuration.

12.3. Custom Trigger Functions

Trigger functions are just normal Python functions, with a few special properties:

  • they must:
    • be defined in a module with the same name as the function;
    • be compatible with the same Python version that runs the Cylc workflow server program (see Installation for the latest version specification).
  • they can be located in:
    • <cylc-dir>/lib/cylc/xtriggers/;
    • <suite-dir>/lib/python/;
    • or anywhere in your Python library path.
  • they can take arbitrary positional and keyword arguments
  • suite and task identity, and cycle point, can be passed to trigger functions by using string templates in function arguments (see below)
  • integer, float, boolean, and string arguments will be recognized and passed to the function as such
  • if a trigger function depends on files or directories (for example) that might not exist when the function is first called, just return unsatisfied until everything required does exist.

Note

Trigger functions cannot store data in memory between invocations because each call is executed in an independent process in the process pool. If necessary the filesystem can be used for this purpose.

The following string templates are available for use, if the trigger function needs any of this information, in function arguments in the suite configuration:

  • %(name)s - name of the dependent task
  • %(id)s - identity of the dependent task (name.cycle-point)
  • %(point)s - cycle point of the dependent task
  • %(debug)s - suite debug mode

and less commonly needed:

  • %(user_name)s - suite owner’s user name
  • %(suite_name)s - registered suite name
  • %(suite_run_dir)s - suite run directory
  • %(suite_share_dir)s - suite share directory

If you need to pass a string template into an xtrigger function as a string literal - i.e. to be used as a template inside the function - escape it with an extra % to avoid detection by the Cylc xtrigger parser: %%(cat)s.

Function return values should be as follows:

  • if the trigger condition is not satisfied:
    • return (False, {})
  • if the trigger condition is satisfied:
    • return (True, results)

where results is an arbitrary dictionary of information to be passed to dependent tasks, which in terms of format must:

  • be flat (non-nested);
  • contain only keys which are valid as environment variable names.

See Built-in Suite State Triggers for an example of one such results dictionary and how it gets processed by the suite.
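
To make this concrete, here is a minimal sketch of a custom trigger function; the module/function name file_exists and its path argument are hypothetical, not a built-in:

# file_exists.py - place in one of the locations listed above
import os


def file_exists(path, point=None):
    """Return (True, results) when path exists, else (False, {})."""
    if os.path.exists(path):
        # results must be flat and keyed by valid environment variable names
        return (True, {'file_path': path, 'file_point': str(point)})
    return (False, {})

It could then be declared in the suite with, e.g., the dependent task’s cycle point templated into the path:

[scheduling]
    [[xtriggers]]
        data_ready = file_exists('/data/%(point)s/input.nc', point=%(point)s):PT30S
    [[dependencies]]
        [[[P1D]]]
            graph = "@data_ready => process"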

The suite server program manages trigger functions as follows:

  • they are called asynchronously in the process pool (except for clock triggers, which are called from the main process)
  • they are called repeatedly on a configurable interval, until satisfied - the call interval defaults to PT10S (10 seconds) - repeat calls are not made until the previous call has returned
  • they are subject to the normal process pool command time out - if they take too long to return, the process will be killed
  • they are shared for efficiency: a single call will be made for all triggers that share the same function signature - i.e. the same function name and arguments
  • their return status and results are stored in the suite DB and persist across suite restarts
  • their stdout, if any, is redirected to stderr and will be visible in the suite log in debug mode (stdout is needed to communicate return values from the sub-process in which the function executes)

12.3.1. Toy Examples

A couple of toy examples in <cylc-dir>/lib/cylc/xtriggers/ may be a useful aid to understanding trigger functions and how they work.

12.3.1.1. echo

The echo function is a trivial one that takes any number of positional and keyword arguments (from the suite configuration) and simply prints them to stdout, and then returns False (i.e. trigger condition not satisfied). Here it is in its entirety.

def echo(*args, **kwargs):
    print "echo: ARGS:", args
    print "echo: KWARGS:", kwargs
    return (False, {})

Here’s an example echo trigger suite:

[scheduling]
    initial cycle point = now
    [[xtriggers]]
        echo_1 = echo(hello, 99, qux=True, point=%(point)s, foo=10)
    [[dependencies]]
        [[[PT1H]]]
            graph = "@echo_1 => foo"
[runtime]
    [[foo]]
        script = exit 1

To see the result, run this suite in debug mode and take a look at the suite log (or run cylc run --debug --no-detach <suite> and watch your terminal).

12.3.1.2. xrandom

The xrandom function sleeps for a configurable amount of time (useful for testing the effect of a long-running trigger function - which should be avoided) and has a configurable random chance of success. The function signature is:

xrandom(percent, secs=0, _=None, debug=False)

The percent argument sets the odds of success in any given call; secs is the number of seconds to sleep before returning; and the _ argument (underscore is a conventional name for a variable that is not used, in Python) is provided to allow specialization of the trigger to (for example) task name, task ID, or cycle point (just use the appropriate string templates in the suite configuration for this).

An example xrandom trigger suite is <cylc-dir>/etc/dev-suites/xtriggers/xrandom/.

12.4. Current Limitations

The following issues may be addressed in future Cylc releases:

  • trigger labels cannot currently be used in conditional (OR) expressions in the graph; attempts to do so will fail validation.
  • aside from the predefined zero-offset wall_clock trigger, all unique trigger function calls must be declared with all of their arguments under the [scheduling][xtriggers] section, and referred to by label alone in the graph. It would be convenient (and less verbose, although no more functional) if we could just declare a label against the common arguments, and give remaining arguments (such as different wall clock offsets in clock triggers) as needed in the graph.
  • we may move away from the string templating method for providing suite and task attributes to trigger function arguments.

12.5. Filesystem Events?

Cylc does not have built-in support for triggering off of filesystem events such as inotify on Linux. There is no cross-platform standard for this, and in any case filesystem events are not very useful in HPC cluster environments where events can only be detected at the specific node on which they were generated.

12.6. Continuous Event Watchers?

For some applications a persistent process that continually monitors the external world is better than discrete periodic checking. This would be more difficult to support as a plugin mechanism in Cylc, but we may decide to do it in the future. In the meantime, consider implementing a small daemon process as the watcher (e.g. to watch continuously for filesystem events) and have your Cylc trigger functions interact with it.

12.7. Old-Style External Triggers (Deprecated)

Note

This mechanism is now technically deprecated by the newer external trigger functions (External Triggers). (However we don’t recommend wholesale conversion to the new method yet, until its interface has stabilized - see Current Limitations.)

These old-style external triggers are hidden task prerequisites that must be satisfied by using the cylc ext-trigger client command to send an associated pre-defined event message to the suite along with an ID string that distinguishes one instance of the event from another (the name of the target task and its current cycle point are not required). The event ID is just an arbitrary string to Cylc, but it can be used to identify something associated with the event to the suite - such as the filename of a new externally-generated dataset. When the suite server program receives the event notification it will trigger the next instance of any task waiting on that trigger (whatever its cycle point) and then broadcast (see Cylc Broadcast) the event ID to the cycle point of the triggered task as $CYLC_EXT_TRIGGER_ID. Downstream tasks with the same cycle point therefore know the new event ID too and can use it, if they need to, to identify the same new dataset. In this way a whole workflow can be associated with each new dataset, and multiple datasets can be processed in parallel if they happen to arrive in quick succession.

An externally-triggered task must register the event it waits on in the suite scheduling section:

# suite "sat-proc"
[scheduling]
    cycling mode = integer
    initial cycle point = 1
    [[special tasks]]
        external-trigger = get-data("new sat X data avail")
    [[dependencies]]
        [[[P1]]]
            graph = get-data => conv-data => products

Then, each time a new dataset arrives the external detection system should notify the suite like this:

$ cylc ext-trigger sat-proc "new sat X data avail" passX12334a

where “sat-proc” is the suite name and “passX12334a” is the ID string for the new event. The suite passphrase must be installed on the triggering account.

Note

Only one task in a suite can trigger off a particular external message. Other tasks can trigger off the externally triggered task as required, of course.

<cylc-dir>/etc/examples/satellite/ext-triggers/suite.rc is a working example of a simulated satellite processing suite.

External triggers are not normally needed in date-time cycling suites driven by real time data that comes in at regular intervals. In these cases a data retrieval task can be clock-triggered (and have appropriate retry intervals) to submit at the expected data arrival time, so little time is wasted in polling. However, if the arrival time of the cycle-point-specific data is highly variable, external triggering may be used with the cycle point embedded in the message:

# suite "data-proc"
[scheduling]
    initial cycle point = 20150125T00
    final cycle point   = 20150126T00
    [[special tasks]]
        external-trigger = get-data("data arrived for $CYLC_TASK_CYCLE_POINT")
    [[dependencies]]
        [[[T00]]]
            graph = init-process => get-data => post-process

Then, when the data finally arrives, the external detection system should notify the suite like this:

$ cylc ext-trigger data-proc "data arrived for 20150126T00" passX12334a

where “data-proc” is the suite name, the cycle point has replaced the variable in the trigger string, and “passX12334a” is the ID string for the new event. The suite passphrase must be installed on the triggering account. In this case, the event will trigger for the second cycle point but not the first because of the cycle-point matching.

13. Running Suites

This chapter currently features a diverse collection of topics related to running suites. Please also see Tutorial and Command Reference, and experiment with plenty of examples.

13.1. Suite Start-Up

There are three ways to start a suite running: cold start and warm start, which start from scratch; and restart, which starts from a prior suite state checkpoint. The only difference between cold starts and warm starts is that warm starts start from a point beyond the suite initial cycle point.

Once a suite is up and running, it is typically a restart that is needed (but see also cylc reload). Be aware that cold and warm starts wipe out prior suite state, so you can’t go back to a restart if you decide you made a mistake.

13.1.1. Cold Start

A cold start is the primary way to start a suite run from scratch:

$ cylc run SUITE [INITIAL_CYCLE_POINT]

The initial cycle point may be specified on the command line or in the suite.rc file. The scheduler starts by loading the first instance of each task at the suite initial cycle point, or at the next valid point for the task.

13.1.2. Warm Start

A warm start runs a suite from scratch like a cold start, but from the beginning of a given cycle point that is beyond the suite initial cycle point. This is generally inferior to a restart (which loads a previously recorded suite state - see Restart and Suite State Checkpoints) because it may result in some tasks rerunning. However, a warm start may be required if a restart is not possible, e.g. because the suite run database was accidentally deleted. The warm start cycle point must be given on the command line:

$ cylc run --warm SUITE [START_CYCLE_POINT]

The original suite initial cycle point is preserved, but all tasks and dependencies before the given warm start cycle point are ignored.

The scheduler starts by loading a first instance of each task at the warm start cycle point, or at the next valid point for the task. R1-type tasks behave exactly the same as other tasks - if their cycle point is at or later than the given start cycle point, they will run; if not, they will be ignored.

13.1.3. Restart and Suite State Checkpoints

At restart (see cylc restart --help) a suite server program initializes its task pool from a previously recorded checkpoint state. By default the latest automatic checkpoint - which is updated with every task state change - is loaded so that the suite can carry on exactly as it was just before being shut down or killed.

$ cylc restart SUITE

Tasks recorded in the “submitted” or “running” states are automatically polled (see Task Job Polling) at start-up to determine what happened to them while the suite was down.

13.1.3.1. Restart From Latest Checkpoint

To restart from the latest checkpoint simply invoke the cylc restart command with the suite name (or select “restart” in the GUI suite start dialog window):

$ cylc restart SUITE
13.1.3.2. Restart From Another Checkpoint

Suite server programs automatically update the “latest” checkpoint every time a task changes state, and at every suite restart, but you can also take checkpoints at other times. To tell a suite server program to checkpoint its current state:

$ cylc checkpoint SUITE-NAME CHECKPOINT-NAME

The 2nd argument is a name to identify the checkpoint later with:

$ cylc ls-checkpoints SUITE-NAME

For example, with checkpoints named “bob”, “alice”, and “breakfast”:

$ cylc ls-checkpoints SUITE-NAME
#######################################################################
# CHECKPOINT ID (ID|TIME|EVENT)
1|2017-11-01T15:48:34+13|bob
2|2017-11-01T15:48:47+13|alice
3|2017-11-01T15:49:00+13|breakfast
...
0|2017-11-01T17:29:19+13|latest

To see the actual task state content of a given checkpoint ID (if you need to), for the moment you have to interrogate the suite DB, e.g.:

$ sqlite3 ~/cylc-run/SUITE-NAME/log/db \
    'select * from task_pool_checkpoints where id == 3;'
3|2012|model|1|running|
3|2013|pre|0|waiting|
3|2013|post|0|waiting|
3|2013|model|0|waiting|
3|2013|upload|0|waiting|

Note

A checkpoint captures the instantaneous state of every task in the suite, including any tasks that are currently active, so you may want to be careful where you do it. Tasks recorded as active are polled automatically on restart to determine what happened to them.

The checkpoint ID 0 (zero) is always used for the latest state of the suite, which is updated continuously as the suite progresses. The checkpoint IDs of earlier states are positive integers starting from 1, incremented each time a new checkpoint is stored. Currently suites automatically store checkpoints before and after reloads, and on restarts (using the latest checkpoints from before the restarts).

Once you have identified the right checkpoint, restart the suite like this:

$ cylc restart --checkpoint=CHECKPOINT-ID SUITE

or enter the checkpoint ID in the space provided in the GUI restart window.

13.1.3.3. Checkpointing With A Task

Checkpoints can be generated automatically at particular points in the workflow by coding tasks that run the cylc checkpoint command:

[scheduling]
   [[dependencies]]
      [[[PT6H]]]
          graph = "pre => model => post => checkpointer"
[runtime]
   # ...
   [[checkpointer]]
      script = """
wait "${CYLC_TASK_MESSAGE_STARTED_PID}" 2>/dev/null || true
cylc checkpoint ${CYLC_SUITE_NAME} CP-${CYLC_TASK_CYCLE_POINT}
               """

Note

We need to “wait” on the “task started” message - which is sent in the background to avoid holding tasks up in a network outage - to ensure that the checkpointer task is correctly recorded as running in the checkpoint (at restart the suite server program will poll to confirm that the task job finished successfully). Otherwise it may be recorded in the waiting state and, if its upstream dependencies have already been cleaned up, it will need to be manually reset from waiting to succeeded after the restart to avoid stalling the suite.

13.1.3.4. Behaviour of Tasks on Restart

All tasks are reloaded in exactly their checkpointed states. Failed tasks are not automatically resubmitted at restart in case the underlying problem has not been addressed yet.

Tasks recorded in the submitted or running states are automatically polled on restart, to see if they are still waiting in a batch queue, still running, or if they succeeded or failed while the suite was down. The suite state will be updated automatically according to the poll results.

Existing instances of tasks removed from the suite configuration before restart are not removed from the task pool automatically, but they will not spawn new instances. They can be removed manually if necessary, with cylc remove.

Similarly, instances of new tasks added to the suite configuration before restart are not inserted into the task pool automatically, because it is very difficult in general to automatically determine the cycle point of the first instance. Instead, the first instance of a new task should be inserted manually at the right cycle point, with cylc insert.
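
For example (the suite name, task names, and cycle point here are illustrative):

$ cylc remove SUITE old-task.20170101T00Z
$ cylc insert SUITE new-task.20170101T00Z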

13.2. Reloading The Suite Configuration At Runtime

The cylc reload command tells a suite server program to reload its suite configuration at run time. This is an alternative to shutting a suite down and restarting it after making changes.

As for a restart, existing instances of tasks removed from the suite configuration before reload are not removed from the task pool automatically, but they will not spawn new instances. They can be removed manually if necessary, with cylc remove.

Similarly, instances of new tasks added to the suite configuration before reload are not inserted into the pool automatically. The first instance of each must be inserted manually at the right cycle point, with cylc insert.

13.3. Task Job Access To Cylc

Task jobs need access to Cylc on the job host, primarily for task messaging, but also to allow user-defined task scripting to run other Cylc commands.

Cylc should be installed on job hosts as on suite hosts, with different releases installed side-by-side and invoked via the central Cylc wrapper according to the value of $CYLC_VERSION - see Installing Cylc. Task job scripts set $CYLC_VERSION to the version of the parent suite server program, so that the right Cylc will be invoked by jobs on the job host.

Access to the Cylc executable (preferably the central wrapper as just described) for different job hosts can be configured using site and user global configuration files (on the suite host). If the environment for running the Cylc executable is only set up correctly in a login shell for a given host, you can set [hosts][HOST]use login shell = True for the relevant host (this is the default, to cover more sites automatically). If the environment is already correct without the login shell, but the Cylc executable is not in $PATH, then [hosts][HOST]cylc executable can be used to specify the direct path to the executable.
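
For example, a global config sketch for a hypothetical job host (the host name and executable path are illustrative):

# global.rc (site or user)
[hosts]
    [[hpc1.example.com]]
        # environment is correct without a login shell:
        use login shell = False
        # the cylc wrapper is not in $PATH on this host:
        cylc executable = /opt/cylc/bin/cylc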

To customize the environment more generally for Cylc on job hosts, use of job-init-env.sh is described in Configure Environment on Job Hosts.

13.4. The Suite Contact File

At start-up, suite server programs write a suite contact file $HOME/cylc-run/SUITE/.service/contact that records suite host, user, port number, process ID, Cylc version, and other information. Client commands can read this file, if they have access to it, to find the target suite server program.

13.5. Task Job Polling

At any point after job submission task jobs can be polled to check that their true state conforms to what is currently recorded by the suite server program. See cylc poll --help for how to poll one or more tasks manually, or right-click poll a task or family in the GUI.

Polling may be necessary if, for example, a task job gets killed by the untrappable SIGKILL signal (e.g. kill -9 PID), or if a network outage prevents task success or failure messages getting through, or if the suite server program itself is down when tasks finish execution.

To poll a task job the suite server program interrogates the batch system, and the job.status file, on the job host. This information is enough to determine the final task status even if the job finished while the suite server program was down or unreachable on the network.

13.5.1. Routine Polling

Task jobs are automatically polled at certain times: once on job submission timeout; several times on exceeding the job execution time limit; and at suite restart, when any tasks recorded as active in the suite state checkpoint are polled to find out what happened to them while the suite was down.

Finally, if necessary, routine polling can be configured as the way to track job status on job hosts that do not allow network routing back to the suite host for task messaging by HTTPS or ssh. See Polling to Track Job Status.

13.6. Tracking Task State

Cylc supports three ways of tracking task state on job hosts:

  • task-to-suite messaging via HTTPS
  • task-to-suite messaging via non-interactive ssh to the suite host, then local HTTPS
  • regular polling by the suite server program

These can be configured per job host in the Cylc global config file - see Global (Site, User) Config File Reference.

If your site prohibits HTTPS and ssh back from job hosts to suite hosts, before resorting to the polling method you should consider installing dedicated Cylc servers or VMs inside the HPC trust zone (where HTTPS and ssh should be allowed).

It is also possible to run Cylc suite server programs on HPC login nodes, but this is not recommended for load, run duration, and GUI reasons.

Finally, it has been suggested that port forwarding may provide another solution - but that is beyond the scope of this document.

13.6.1. HTTPS Task Messaging

Task job wrappers automatically invoke cylc message to report progress back to the suite server program when they begin executing, at normal exit (success) and abnormal exit (failure).

By default the messaging occurs via an authenticated, HTTPS connection to the suite server program. This is the preferred task communications method - it is efficient and direct.

Suite server programs automatically install suite contact information and credentials on job hosts. Users only need to do this manually for remote access to suites on other hosts, or suites owned by other users - see Remote Control.

13.6.2. Ssh Task Messaging

Cylc can be configured to re-invoke task messaging commands on the suite host via non-interactive ssh (from job host to suite host). Then a local HTTPS connection is made to the suite server program.

User-invoked client commands (aside from the GUI, which requires HTTPS) can do the same thing with the --use-ssh command option.

This is less efficient than direct HTTPS messaging, but it may be useful at sites where the HTTPS ports are blocked but non-interactive ssh is allowed.

13.6.3. Polling to Track Job Status

Finally, suite server programs can actively poll task jobs at configurable intervals, via non-interactive ssh to the job host.

Polling is the least efficient task communications method because task state is updated only at intervals, not when task events actually occur. However, it may be needed at sites that do not allow HTTPS or non-interactive ssh from job host to suite host.

Be careful to avoid spamming task hosts with polling commands. Each poll opens (and then closes) a new ssh connection.

Polling intervals are configurable under [runtime] because they may depend on the expected execution time. For instance, a task that typically takes an hour to run might be polled every 10 minutes initially, and then every minute toward the end of its run. Interval values are used in turn until the last value, which is used repeatedly until finished:

[runtime]
    [[foo]]
        [[[job]]]
            # poll every minute in the 'submitted' state:
            submission polling intervals = PT1M
            # poll one minute after foo starts running, then every 10
            # minutes for 50 minutes, then every minute until finished:
            execution polling intervals = PT1M, 5*PT10M, PT1M

A list of intervals with optional multipliers can be used for both submission and execution polling, although a single value is probably sufficient for submission polling. If these items are not configured, default values from the site and user global config will be used for the polling task communication method; polling is not done by default under the other task communications methods (but it can still be used if you like).

13.6.4. Task Communications Configuration

13.7. The Suite Service Directory

At registration time a suite service directory, $HOME/cylc-run/<SUITE>/.service/, is created and populated with a private passphrase file (containing random text), a self-signed SSL certificate (see Client-Server Interaction), and a symlink to the suite source directory. An existing passphrase file will not be overwritten if a suite is re-registered.

At run time, the private suite run database is also written to the service directory, along with a suite contact file that records the host, user, port number, process ID, Cylc version, and other information about the suite server program. Client commands automatically read daemon targeting information from the contact file, if they have access to it.

13.8. File-Reading Commands

Some Cylc commands and GUI actions parse suite configurations or read other files from the suite host account, rather than communicate with a suite server program over the network. In future we plan to have the suite server program serve up these files to clients, but for the moment this functionality requires read-access to the relevant files on the suite host.

If you are logged into the suite host account, file-reading commands will just work.

13.8.1. Remote Host, Shared Home Directory

If you are logged into another host with shared home directories (shared filesystems are common in HPC environments) file-reading commands will just work because suite files will look “local” on both hosts.

13.8.2. Remote Host, Different Home Directory

If you are logged into another host with no shared home directory, file-reading commands require non-interactive ssh to the suite host account, and use of the --host and --user options to re-invoke the command on the suite account.

13.8.3. Same Host, Different User Account

(This is essentially the same as Remote Host, Different Home Directory.)

13.9. Client-Server Interaction

Cylc server programs listen on dedicated network ports for HTTPS communications from Cylc clients (task jobs, and user-invoked commands and GUIs).

Use cylc scan to see which suites are listening on which ports on scanned hosts (this lists your own suites by default, but it can show others too - see cylc scan --help).

Cylc supports two kinds of access to suite server programs:

13.9.1. Public Access - No Auth Files

Without a suite passphrase the amount of information revealed by a suite server program is determined by the public access privilege level set in global site/user config ([authentication]) and optionally overridden in suites ([cylc] -> [[authentication]]):

  • identity - only suite and owner names revealed
  • description - identity plus suite title and description
  • state-totals - identity, description, and task state totals
  • full-read - full read-only access for monitor and GUI
  • shutdown - full read access plus shutdown, but no other control.

The default public access level is state-totals.

The cylc scan command and the cylc gscan GUI can print descriptions and task state totals in addition to basic suite identity, if that information is revealed publicly.

13.9.2. Full Control - With Auth Files

Suite auth files (passphrase and SSL certificate) give full control. They are loaded from the suite service directory by the suite server program at start-up, and used to authenticate subsequent client connections. Passphrases are used in a secure encrypted challenge-response scheme, never sent in plain text over the network.

If two users need access to the same suite server program, they must both possess the passphrase file for that suite. Fine-grained access to a single suite server program via distinct user accounts is not currently supported.

Suite server programs automatically install their auth and contact files to job hosts via ssh, to enable task jobs to connect back to the suite server program for task messaging.

Client programs invoked by the suite owner automatically load the passphrase, SSL certificate, and contact file too, for automatic connection to suites.

Manual installation of suite auth files is only needed for remote control, if you do not have a shared filesystem - see below.

13.10. GUI-to-Suite Interaction

The gcylc GUI is mainly a network client to retrieve and display suite status information from the suite server program, but it can also invoke file-reading commands to view and graph the suite configuration and so on. This is entirely transparent if the GUI is running on the suite host account, but full functionality for remote suites requires either a shared filesystem, or (see Remote Control) auth file installation and non-interactive ssh access to the suite host. Without the auth files you will not be able to connect to the suite, and without ssh you will see “permission denied” errors on attempting file access.

13.11. Remote Control

Cylc client programs - command line and GUI - can interact with suite server programs running on other accounts or hosts. How this works depends on whether or not you have:

  • a shared filesystem such that you see the same home directory on both hosts.
  • non-interactive ssh from the client account to the server account.

With a shared filesystem, a suite registered on the remote (server) host is also - in effect - registered on the local (client) host. In this case you can invoke client commands without the --host option; the client will automatically read the host and port from the contact file in the suite service directory.

To control suite server programs running under other user accounts or on other hosts without a shared filesystem, the suite SSL certificate and passphrase must be installed under your $HOME/.cylc/ directory:

$HOME/.cylc/auth/OWNER@HOST/SUITE/
      ssl.cert
      passphrase
      contact  # (optional - see below)

where OWNER@HOST is the suite host account and SUITE is the suite name. Client commands should then be invoked with the --user and --host options, e.g.:

$ cylc gui --user=OWNER --host=HOST SUITE

Note

Remote suite auth files do not need to be installed for read-only access - see Public Access - No Auth Files - via the GUI or monitor.

The suite contact file (see The Suite Contact File) is not needed if you have read-access to the remote suite run directory via the local filesystem or non-interactive ssh to the suite host account - client commands will automatically read it. If you do install the contact file in your auth directory note that the port number will need to be updated if the suite gets restarted on a different port. Otherwise use cylc scan to determine the suite port number and use the --port client command option.

Warning

Possession of a suite passphrase gives full control over the target suite, including edit run functionality - which lets you run arbitrary scripting on job hosts as the suite owner. Further, non-interactive ssh gives full access to the target user account, so we recommend that this only be used to interact with suites running on accounts to which you already have full access.

13.12. Scan And Gscan

Both cylc scan and the cylc gscan GUI can display suites owned by other users on other hosts, including task state totals if the public access level permits that (see Public Access - No Auth Files). Clicking on a remote suite in gscan will open a cylc gui to connect to that suite. This will give you full control if you have the suite auth files installed, or full read-only information if the public access level allows that.

13.13. Task States Explained

As a suite runs, its task proxies may pass through the following states:

  • waiting - still waiting for prerequisites (e.g. dependence on other tasks, and clock triggers) to be satisfied.
  • held - will not be submitted to run even if all prerequisites are satisfied, until released/un-held.
  • queued - ready to run (prerequisites satisfied) but temporarily held back by an internal cylc queue (see Limiting Activity With Internal Queues).
  • ready - ready to run (prerequisites satisfied) and handed to cylc’s job submission sub-system.
  • submitted - submitted to run, but not executing yet (could be waiting in an external batch scheduler queue).
  • submit-failed - job submission failed or submitted job killed (cancelled) before commencing execution.
  • submit-retrying - job submission failed, but a submission retry was configured. Will only enter the submit-failed state if all configured submission retries are exhausted.
  • running - currently executing (a task started message was received, or the task polled as running).
  • succeeded - finished executing successfully (a task succeeded message was received, or the task polled as succeeded).
  • failed - aborted execution due to some error condition (a task failed message was received, or the task polled as failed).
  • retrying - job execution failed, but an execution retry was configured. Will only enter the failed state if all configured execution retries are exhausted.
  • runahead - will not have prerequisites checked (and so automatically held, in effect) until the rest of the suite catches up sufficiently. The amount of runahead allowed is configurable - see Runahead Limiting.
  • expired - will not be submitted to run, due to falling too far behind the wall-clock relative to its cycle point - see Clock-Expire Triggers.

13.14. What The Suite Control GUI Shows

The GUI Text-tree and Dot Views display the state of every task proxy present in the task pool. Once a task has succeeded and Cylc has determined that it can no longer be needed to satisfy the prerequisites of other tasks, its proxy will be cleaned up (removed from the pool) and it will disappear from the GUI. To rerun a task that has disappeared from the pool, you need to re-insert its task proxy and then re-trigger it.

The Graph View is slightly different: it displays the complete dependency graph over the range of cycle points currently present in the task pool. This often includes some greyed-out base or ghost nodes that are empty - i.e. there are no corresponding task proxies currently present in the pool. Base nodes just flesh out the graph structure. Groups of them may be cut out and replaced by single scissor nodes in sections of the graph that are currently inactive.

13.15. Network Connection Timeouts

A connection timeout can be set in site and user global config files (see Global (Site, User) Configuration Files) so that messaging commands cannot hang indefinitely if the suite is not responding (which can be caused, for example, by suspending a suite with Ctrl-Z) and thereby prevent the task from completing. The same can be done on the command line for other suite-connecting user commands, with the --comms-timeout option.

13.16. Runahead Limiting

Runahead limiting prevents the fastest tasks in a suite from getting too far ahead of the slowest ones. Newly spawned tasks are released to the task pool only when they fall below the runahead limit. A low runahead limit can prevent cylc from interleaving cycles, but it will not stall a suite unless it fails to extend out past a future trigger (see Inter-Cycle Triggers). A high runahead limit may allow fast tasks that are not constrained by dependencies or clock-triggers to spawn far ahead of the pack, which could have performance implications for the suite server program when running very large suites. Succeeded and failed tasks are ignored when computing the runahead limit.

The preferred runahead limiting mechanism restricts the number of consecutive active cycle points. The default is three active cycle points; see [scheduling] -> max active cycle points. Alternatively the interval between the slowest and fastest tasks can be specified as a hard limit; see [scheduling] -> runahead limit.

13.17. Limiting Activity With Internal Queues

Large suites can potentially overwhelm task hosts by submitting too many tasks at once. You can prevent this with internal queues, which limit the number of tasks that can be active (submitted or running) at the same time.

Internal queues behave in a first-in-first-out (FIFO) manner, i.e. tasks are released from a queue in the same order that they were queued.

A queue is defined by a name; a limit, which is the maximum number of active tasks allowed for the queue; and a list of members, assigned by task or family name.

Queue configuration is done under the [scheduling] section of the suite.rc file (like dependencies, internal queues constrain when a task runs).

By default every task is assigned to the default queue, which by default has a zero limit (interpreted by cylc as no limit). To use a single queue for the whole suite just set the default queue limit:

[scheduling]
    [[queues]]
        # limit the entire suite to 5 active tasks at once
        [[[default]]]
            limit = 5

To use additional queues just name each one, set their limits, and assign members:

[scheduling]
    [[queues]]
        [[[q_foo]]]
            limit = 5
            members = foo, bar, baz

Any tasks not assigned to a particular queue will remain in the default queue. The queues example suite illustrates how queues work by running two task trees side by side (as seen in the graph GUI), limited to two and three active tasks respectively:

[meta]
    title = demonstrates internal queueing
    description = """
Two trees of tasks: the first uses the default queue set to a limit of
two active tasks at once; the second uses another queue limited to three
active tasks at once. Run via the graph control GUI for a clear view.
              """
[scheduling]
    [[queues]]
        [[[default]]]
            limit = 2
        [[[foo]]]
            limit = 3
            members = n, o, p, FAM2, u, v, w, x, y, z
    [[dependencies]]
        graph = """
            a => b & c => FAM1
            n => o & p => FAM2
            FAM1:succeed-all => h & i & j & k & l & m
            FAM2:succeed-all => u & v & w & x & y & z
                """
[runtime]
    [[FAM1, FAM2]]
    [[d,e,f,g]]
        inherit = FAM1
    [[q,r,s,t]]
        inherit = FAM2

13.18. Automatic Task Retry On Failure

See also [runtime] -> [[__NAME__]] -> [[[job]]] -> execution retry delays.

Tasks can be configured with a list of “retry delay” intervals, as ISO 8601 durations. If the task job fails it will go into the retrying state and resubmit after the next configured delay interval. An example is shown in the suite listed below under Task Event Handling.
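
A minimal sketch (the task name and intervals here are illustrative):

[runtime]
    [[model]]
        [[[job]]]
            # retry after 10 minutes (up to 3 times), then after 1 hour:
            execution retry delays = 3*PT10M, PT1H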

If a task with configured retries is killed (by cylc kill or via the GUI) it goes to the held state so that the operator can decide whether to release it and continue the retry sequence or to abort the retry sequence by manually resetting it to the failed state.

13.19. Task Event Handling

See also [cylc] -> [[events]] and [runtime] -> [[__NAME__]] -> [[[events]]].

Cylc can call nominated event handlers - to do whatever you like - when certain suite or task events occur. This facilitates centralized alerting and automated handling of critical events. Event handlers can be used to send a message, call a pager, or whatever; they can even intervene in the operation of their own suite using cylc commands.

To send an email, use the built-in setting [[[events]]]mail events to specify a list of events for which notifications should be sent. (The name of a registered task output can also be used as an event name in this case.) E.g. to send an email on (submission) failed and retry:

[runtime]
    [[foo]]
        script = """
            test ${CYLC_TASK_TRY_NUMBER} -eq 3
            cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" 'oopsy daisy'
        """
        [[[events]]]
            mail events = submission failed, submission retry, failed, retry, oops
        [[[job]]]
            execution retry delays = PT0S, PT30S
        [[[outputs]]]
            oops = oopsy daisy

By default, the emails will be sent to the current user with:

  • to: set as $USER
  • from: set as notifications@$(hostname)
  • SMTP server at localhost:25

These can be configured using the settings:

  • [[[events]]]mail to (list of email addresses),
  • [[[events]]]mail from
  • [[[events]]]mail smtp.

By default, a cylc suite will send you no more than one task event email every 5 minutes - this is to prevent your inbox from being flooded by emails should a large group of tasks all fail at a similar time. See [cylc] -> task event mail interval for details.
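
For example, to direct notifications to a shared address using the settings above (the addresses and SMTP server here are illustrative):

[runtime]
    [[foo]]
        [[[events]]]
            mail events = submission failed, failed
            mail to = ops-team@example.com
            mail from = cylc-alerts@example.com
            mail smtp = smtp.example.com:25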

Event handlers can be located in the suite bin/ directory; otherwise it is up to you to ensure their location is in $PATH (in the shell in which the suite server program runs). They should require little resource and return quickly - see Managing External Command Execution.

Task event handlers can be specified using the [[[events]]]<event> handler settings, where <event> is one of:

  • ‘submitted’ - the job submit command was successful
  • ‘submission failed’ - the job submit command failed
  • ‘submission timeout’ - task job submission timed out
  • ‘submission retry’ - task job submission failed, but will retry after a configured delay
  • ‘started’ - the task reported commencement of execution
  • ‘succeeded’ - the task reported successful completion
  • ‘warning’ - the task reported a WARNING severity message
  • ‘critical’ - the task reported a CRITICAL severity message
  • ‘custom’ - the task reported a CUSTOM severity message
  • ‘late’ - the task is never active and is late
  • ‘failed’ - the task failed
  • ‘retry’ - the task failed but will retry after a configured delay
  • ‘execution timeout’ - task execution timed out

The value of each setting should be a list of command lines or command line templates (see below).

Alternatively you can use [[[events]]]handlers and [[[events]]]handler events, where the former is a list of command lines or command line templates (see below) and the latter is a list of events for which these commands should be invoked. (The name of a registered task output can also be used as an event name in this case.)

Event handler arguments can be constructed from various templates representing suite name; task ID, name, cycle point, message, and submit number; and any suite or task [meta] item. See [cylc] -> [[events]] and [runtime] -> [[__NAME__]] -> [[[events]]] for options.

If no template arguments are supplied the following default command line will be used:

<task-event-handler> %(event)s %(suite)s %(id)s %(message)s

Note

Substitution patterns should not be quoted in the template strings. This is done automatically where required.

For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation.
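
For example, a handler invoked with the default arguments above could be a simple shell script like this (a sketch; the script name and use of the system log are illustrative):

#!/bin/bash
# my-handler: invoked by the suite server program as
#   my-handler <event> <suite> <id> <message>
set -eu
EVENT="$1"; SUITE="$2"; TASK_ID="$3"; MESSAGE="${4:-}"
# Forward the event to the system log (could equally be a pager or chat hook).
logger -t cylc-event "suite=${SUITE} task=${TASK_ID} event=${EVENT}: ${MESSAGE}"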

The retry event occurs if a task fails and has any remaining retries configured (see Automatic Task Retry On Failure). The event handler will be called as soon as the task fails, not after the retry delay period when it is resubmitted.

Note

Event handlers are called by the suite server program, not by task jobs. If you wish to pass additional information to them use [cylc] -> [[environment]], not task runtime environment.

The following two suite.rc snippets are examples of how to specify event handlers using the two alternative methods:

[runtime]
    [[foo]]
        script = test ${CYLC_TASK_TRY_NUMBER} -eq 2
        [[[events]]]
            retry handler = "echo '!!!!!EVENT!!!!!' "
            failed handler = "echo '!!!!!EVENT!!!!!' "
        [[[job]]]
            execution retry delays = PT0S, PT30S
[runtime]
    [[foo]]
        script = """
            test ${CYLC_TASK_TRY_NUMBER} -eq 2
            cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" 'oopsy daisy'
        """
        [[[events]]]
            handlers = "echo '!!!!!EVENT!!!!!' "
            # Note: task output name can be used as an event in this method
            handler events = retry, failed, oops
        [[[job]]]
            execution retry delays = PT0S, PT30S
        [[[outputs]]]
            oops = oopsy daisy

The handler command here - specified with no arguments - is called with the default arguments, like this:

echo '!!!!!EVENT!!!!!' %(event)s %(suite)s %(id)s %(message)s

13.19.1. Late Events

You may want to be notified when certain tasks are running late in a real time production system - i.e. when they have not triggered by the usual time. Tasks of primary interest are not normally clock-triggered, however, so their trigger times are mostly a function of how the suite runs in its environment, and even of external factors such as contention with other suites [3].

But if your system is reasonably stable from one cycle to the next such that a given task has consistently triggered by some interval beyond its cycle point, you can configure Cylc to emit a late event if it has not triggered by that time. For example, if a task forecast normally triggers by 30 minutes after its cycle point, configure late notification for it like this:

[runtime]
   [[forecast]]
        script = run-model.sh
        [[[events]]]
            late offset = PT30M
            late handler = my-handler %(message)s

Late offset intervals are not computed automatically so be careful to update them after any change that affects triggering times.

Note

Cylc can only check for lateness in tasks that it is currently aware of. If a suite gets delayed over many cycles the next tasks coming up can be identified as late immediately, and subsequent tasks can be identified as late as the suite progresses to subsequent cycle points, until it catches up to the clock.

13.20. Managing External Command Execution

Job submission commands, event handlers, and job poll and kill commands, are executed by the suite server program in a “pool” of asynchronous subprocesses, in order to avoid holding the suite up. The process pool is actively managed to limit it to a configurable size (process pool size). Custom event handlers should be light-weight and quick-running because they will tie up a process pool member until they complete, and the suite will appear to stall if the pool is saturated with long-running processes. Processes are killed after a configurable timeout (process pool timeout) however, to guard against rogue commands that hang indefinitely. All process kills are logged by the suite server program. For killed job submissions the associated tasks also go to the submit-failed state.

13.21. Handling Job Preemption

Some HPC facilities allow job preemption: the resource manager can kill or suspend running low priority jobs in order to make way for high priority jobs. The preempted jobs may then be automatically restarted by the resource manager, from the same point (if suspended) or requeued to run again from the start (if killed).

Suspended jobs will poll as still running (their job status file says they started running, and they still appear in the resource manager queue). Loadleveler jobs that are preempted by kill-and-requeue (“job vacation”) are automatically returned to the submitted state by Cylc. This is possible because Loadleveler sends the SIGUSR1 signal before SIGKILL for preemption. Other batch schedulers just send SIGTERM before SIGKILL as normal, so Cylc cannot distinguish a preemption job kill from a normal job kill. After this the job will poll as failed (correctly, because it was killed, and the job status file records that). To handle this kind of preemption automatically you could use a task failed or retry event handler that queries the batch scheduler queue (after an appropriate delay if necessary) and then, if the job has been requeued, uses cylc reset to reset the task to the submitted state.
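
A rough sketch of such a handler, assuming a PBS-like qstat and that the requeued job can be identified by its task name (all names here are illustrative):

#!/bin/bash
# preemption-handler: invoked as <event> <suite> <id> <message>
set -eu
SUITE="$2"; TASK_ID="$3"
sleep 30  # give the resource manager time to requeue the job
# Hypothetical check: is a job for this task name back in the user's queue?
if qstat -u "${USER}" | grep -q "${TASK_ID%%.*}"; then
    # the job was requeued - put the task back in the submitted state
    cylc reset --state=submitted "${SUITE}" "${TASK_ID}"
fi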

13.22. Manual Task Triggering and Edit-Run

Any task proxy currently present in the suite can be manually triggered at any time using the cylc trigger command, or from the right-click task menu in gcylc. If the task belongs to a limited internal queue (see Limiting Activity With Internal Queues), this will queue it; if not, or if it is already queued, it will submit immediately.

With cylc trigger --edit (also in the gcylc right-click task menu) you can edit the generated task job script to make one-off changes before the task submits.

13.23. Cylc Broadcast

The cylc broadcast command overrides [runtime] settings in a running suite. This can be used to communicate information to downstream tasks by broadcasting environment variables (communication of information from one task to another normally takes place via the filesystem, i.e. the input/output file relationships embodied in inter-task dependencies). Variables (and any other runtime settings) may be broadcast to all subsequent tasks, or targeted at a specific task, at all subsequent tasks with a given name, or at all tasks with a given cycle point; see the broadcast command help for details.

Broadcast settings targeted at a specific task ID or cycle point expire and are forgotten as the suite moves on. Un-targeted variables and those targeted at a task name persist throughout the suite run, even across restarts, unless manually cleared using the broadcast command - and so should be used sparingly.
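
For example, to broadcast an environment variable to all tasks at one cycle point (the variable name, value, and cycle point are illustrative):

$ cylc broadcast -p 20170101T00Z -s "[environment]DATASET_ID=passX12334a" SUITE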

13.24. The Meaning And Use Of Initial Cycle Point

When a suite is started with the cylc run command (cold or warm start) the cycle point at which it starts can be given on the command line or hardwired into the suite.rc file:

cylc run foo 20120808T06Z

or:

[scheduling]
    initial cycle point = 20120808T06Z

An initial cycle point given on the command line will override one in the suite.rc file.

13.24.1. The Environment Variable CYLC_SUITE_INITIAL_CYCLE_POINT

In the case of a cold start only, the initial cycle point is passed through to task execution environments as $CYLC_SUITE_INITIAL_CYCLE_POINT. The value is then stored in suite database files and persists across restarts, but it does get wiped out (set to None) after a warm start, because a warm start is really an implicit restart in which all state information is lost (except that the previous cycle is assumed to have completed).

The $CYLC_SUITE_INITIAL_CYCLE_POINT variable allows tasks to determine if they are running in the initial cold-start cycle point, when different behaviour may be required, or in a normal mid-run cycle point. Note however that an initial R1 graph section is now the preferred way to get different behaviour at suite start-up.
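
For example, task scripting can branch on the variable like this (a sketch; the default expansion guards against the variable being unset after a warm start):

if [[ "${CYLC_TASK_CYCLE_POINT}" == "${CYLC_SUITE_INITIAL_CYCLE_POINT:-}" ]]; then
    echo "initial cycle point: fetch initial conditions"
else
    echo "mid-run cycle point: use the previous cycle's output"
fi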

13.25. Simulating Suite Behaviour

Several suite run modes allow you to simulate suite behaviour quickly without running the suite’s real jobs - which may be long-running and resource-hungry:

  • dummy mode - runs dummy tasks as background jobs on configured job hosts.
    • simulates scheduling, job host connectivity, and generates all job files on suite and job hosts.
  • dummy-local mode - runs real dummy tasks as background jobs on the suite host, which allows dummy-running suites from other sites.
    • simulates scheduling and generates all job files on the suite host.
  • simulation mode - does not run any real tasks.
    • simulates scheduling without generating any job files.

Set the run mode (default live) in the GUI suite start dialog box, or on the command line:

$ cylc run --mode=dummy SUITE
$ cylc restart --mode=dummy SUITE

You can get specified tasks to fail in these modes, for more flexible suite testing. See [runtime] -> [[__NAME__]] -> [[[simulation]]] for simulation configuration.

13.25.1. Proportional Simulated Run Length

If task [job]execution time limit is set, Cylc divides it by [simulation]speedup factor (default 10.0) to compute simulated task run lengths (default 10 seconds).
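
For example (the values here are illustrative):

[runtime]
    [[model]]
        [[[job]]]
            execution time limit = PT1H
        [[[simulation]]]
            # simulated run length: PT1H / 60.0 = 60 seconds
            speedup factor = 60.0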

13.25.2. Limitations Of Suite Simulation

Dummy mode ignores batch scheduler settings because Cylc does not know which job resource directives (requested memory, number of compute nodes, etc.) would need to be changed for the dummy jobs. If you need to dummy-run jobs on a batch scheduler, manually comment out script items and modify directives in your live suite, or else use a custom live mode test suite.

Note

The dummy modes ignore all configured task script items including init-script. If your init-script is required to run even dummy tasks on a job host, note that host environment setup should be done elsewhere - see Configure Site Environment on Job Hosts.

13.25.3. Restarting Suites With A Different Run Mode?

The run mode is recorded in the suite run database files. Cylc will not let you restart a non-live mode suite in live mode, or vice versa. To test a live suite in simulation mode just take a quick copy of it and run the copy in simulation mode.

13.26. Automated Reference Test Suites

Reference tests are finite-duration suite runs that abort with non-zero exit status if any of the following conditions occur (by default):

  • cylc fails
  • any task fails
  • the suite times out (e.g. a task dies without reporting failure)
  • a nominated shutdown event handler exits with error status

The default shutdown event handler for reference tests is cylc check-triggering, which compares task triggering information (what triggers off what at run time) in the test run suite log to that from an earlier reference run, disregarding the timing and order of events - which can vary according to the external queueing conditions, runahead limit, and so on.

To prepare a reference log for a suite, run it with the --reference-log option, and manually verify the correctness of the reference run.

To reference test a suite, just run it (in dummy mode for the most comprehensive test without running real tasks) with the --reference-test option.
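
For example:

$ cylc run --reference-log SUITE                 # record a reference run
# ... manually verify the reference run, then:
$ cylc run --mode=dummy --reference-test SUITE   # test against the reference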

A battery of automated reference tests is used to test cylc before posting a new release version. Reference tests can also be used to check that a cylc upgrade will not break your own complex suites - the triggering check will catch any bug that causes a task to run when it shouldn’t, for instance; even in a dummy mode reference test the full task job script (sans script items) executes on the proper task host via the proper batch system.

Reference tests can be configured with the following settings:

[cylc]
    [[reference test]]
        suite shutdown event handler = cylc check-triggering
        required run mode = dummy
        allow task failures = False
        live mode suite timeout = PT5M
        dummy mode suite timeout = PT2M
        simulation mode suite timeout = PT2M

13.26.1. Roll-your-own Reference Tests

If the default reference test is not sufficient for your needs, firstly note that you can override the default shutdown event handler, and secondly that the --reference-test option is merely a short cut to the following suite.rc settings which can also be set manually if you wish:

[cylc]
    abort if any task fails = True
    [[events]]
        shutdown handler = cylc check-triggering
        timeout = PT5M
        abort if shutdown handler fails = True
        abort on timeout = True

13.27. Triggering Off Of Tasks In Other Suites

Note

Please read External Triggers before using the older inter-suite triggering mechanism described in this section.

The cylc suite-state command interrogates suite run databases. It has a polling mode that waits for a given task in the target suite to achieve a given state, or receive a given message. This can be used to make task scripting wait for a remote task to succeed (for example).
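
For example, to wait for task foo at a given cycle point in suite other.suite to succeed (the polling parameters here are illustrative):

$ cylc suite-state other.suite --task=foo --point=20170101T00Z \
    --status=succeeded --max-polls=10 --interval=30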

Automatic suite-state polling tasks can be defined with special syntax in the graph. They get automatically-generated task scripting that uses cylc suite-state appropriately (it is an error to give your own script item for these tasks).

Here’s how to trigger a task bar off a task foo in a remote suite called other.suite:

[scheduling]
    [[dependencies]]
        [[[T00, T12]]]
            graph = "my-foo<other.suite::foo> => bar"

Local task my-foo will poll for the success of foo in suite other.suite, at the same cycle point, succeeding only when or if it succeeds. Other task states can also be polled:

graph = "my-foo<other.suite::foo:fail> => bar"

The default polling parameters (e.g. maximum number of polls and the interval between them) are printed by cylc suite-state --help and can be configured if necessary under the local polling task runtime section:

[scheduling]
    [[dependencies]]
        [[[T00,T12]]]
            graph = "my-foo<other.suite::foo> => bar"
[runtime]
    [[my-foo]]
        [[[suite state polling]]]
            max-polls = 100
            interval = PT10S

To poll for the target task to receive a message rather than achieve a state, give the message in the runtime configuration (in which case the task status inferred from the graph syntax will be ignored):

[runtime]
    [[my-foo]]
        [[[suite state polling]]]
            message = "the quick brown fox"

For suites owned by others, or those with run databases in non-standard locations, use the --run-dir option, or in-suite:

[runtime]
    [[my-foo]]
        [[[suite state polling]]]
            run-dir = /path/to/top/level/cylc/run-directory

If the remote task has a different cycling sequence, just arrange for the local polling task to be on the same sequence as the remote task that it represents. For instance, if local task cat cycles 6-hourly at 0,6,12,18 but needs to trigger off a remote task dog at 3,9,15,21:

[scheduling]
    [[dependencies]]
        [[[T03,T09,T15,T21]]]
            graph = "my-dog<other.suite::dog>"
        [[[T00,T06,T12,T18]]]
            graph = "my-dog[-PT3H] => cat"

For suite-state polling, the cycle point is automatically converted to the cycle point format of the target suite.

The remote suite does not have to be running when polling commences because the command interrogates the suite run database, not the suite server program.

Note

The graph syntax for suite polling tasks cannot be combined with cycle point offsets, family triggers, or parameterized task notation. This does not present a problem because suite polling tasks can be put on the same cycling sequence as the remote-suite target task (as recommended above), and there is no point in having multiple tasks (family members or parameterized tasks) performing the same polling operation. Task state triggers can be used with suite polling, e.g. to trigger another task if polling fails after 10 tries at 10 second intervals:

[scheduling]
    [[dependencies]]
        graph = "poller<other-suite::foo:succeed>:fail => another-task"
[runtime]
    [[poller]]
        [[[suite state polling]]]
            max-polls = 10
            interval = PT10S

13.28. Suite Server Logs

Each suite maintains its own log of time-stamped events under the suite server log directory:

$HOME/cylc-run/SUITE-NAME/log/suite/

By way of example, we will show the complete server log generated (at cylc-7.2.0) by a small suite that runs two 30-second dummy tasks foo and bar for a single cycle point 2017-01-01T00Z before shutting down:

[cylc]
    cycle point format = %Y-%m-%dT%HZ
[scheduling]
    initial cycle point = 2017-01-01T00Z
    final cycle point = 2017-01-01T00Z
    [[dependencies]]
        graph = "foo => bar"
[runtime]
    [[foo]]
        script = sleep 30; /bin/false
    [[bar]]
        script = sleep 30; /bin/true

Given the task scripting defined above, this suite will stall when foo fails. Then, the suite owner vagrant@cylon manually resets the failed task’s state to succeeded, allowing bar to trigger and the suite to finish and shut down. Here’s the complete suite log for this run:

$ cylc cat-log SUITE-NAME
2017-03-30T09:46:10Z INFO - Suite starting: server=localhost:43086 pid=3483
2017-03-30T09:46:10Z INFO - Run mode: live
2017-03-30T09:46:10Z INFO - Initial point: 2017-01-01T00Z
2017-03-30T09:46:10Z INFO - Final point: 2017-01-01T00Z
2017-03-30T09:46:10Z INFO - Cold Start 2017-01-01T00Z
2017-03-30T09:46:11Z INFO - [foo.2017-01-01T00Z] -submit_method_id=3507
2017-03-30T09:46:11Z INFO - [foo.2017-01-01T00Z] -submission succeeded
2017-03-30T09:46:11Z INFO - [foo.2017-01-01T00Z] status=submitted: (received)started at 2017-03-30T09:46:10Z for job(01)
2017-03-30T09:46:41Z CRITICAL - [foo.2017-01-01T00Z] status=running: (received)failed/EXIT at 2017-03-30T09:46:40Z for job(01)
2017-03-30T09:46:42Z WARNING - suite stalled
2017-03-30T09:46:42Z WARNING - Unmet prerequisites for bar.2017-01-01T00Z:
2017-03-30T09:46:42Z WARNING -  * foo.2017-01-01T00Z succeeded
2017-03-30T09:47:58Z INFO - [client-command] reset_task_states vagrant@cylon:cylc-reset 1e0d8e9f-2833-4dc9-a0c8-9cf263c4c8c3
2017-03-30T09:47:58Z INFO - [foo.2017-01-01T00Z] -resetting state to succeeded
2017-03-30T09:47:58Z INFO - Command succeeded: reset_task_states([u'foo.2017'], state=succeeded)
2017-03-30T09:47:59Z INFO - [bar.2017-01-01T00Z] -submit_method_id=3565
2017-03-30T09:47:59Z INFO - [bar.2017-01-01T00Z] -submission succeeded
2017-03-30T09:47:59Z INFO - [bar.2017-01-01T00Z] status=submitted: (received)started at 2017-03-30T09:47:58Z for job(01)
2017-03-30T09:48:29Z INFO - [bar.2017-01-01T00Z] status=running: (received)succeeded at 2017-03-30T09:48:28Z for job(01)
2017-03-30T09:48:30Z INFO - Waiting for the command process pool to empty for shutdown
2017-03-30T09:48:30Z INFO - Suite shutting down - AUTOMATIC

The information logged here includes:

  • event timestamps, at the start of each line
  • suite server host, port and process ID
  • suite initial and final cycle points
  • suite start type (cold start in this case)
  • task events (task started, succeeded, failed, etc.)
  • suite stalled warning (in this suite nothing else can run when foo fails)
  • the client command issued by vagrant@cylon to reset foo to succeeded
  • job IDs - in this case process IDs for background jobs (or PBS job IDs etc.)
  • state changes due to incoming task progress messages (“started at …” etc.)
  • suite shutdown time and reason (AUTOMATIC means “all tasks finished and nothing else to do”)

Note

Suite log files are primarily intended for human eyes. If you need to have an external system to monitor suite events automatically, interrogate the sqlite suite run database (see Suite Run Databases) rather than parse the log files.

13.29. Suite Run Databases

Suite server programs maintain two sqlite databases to record restart checkpoints and various other aspects of run history:

$HOME/cylc-run/SUITE-NAME/log/db  # public suite DB
$HOME/cylc-run/SUITE-NAME/.service/db  # private suite DB

The private DB is for use only by the suite server program. The identical public DB is provided for use by external commands such as cylc suite-state, cylc ls-checkpoints, and cylc report-timings. If the public DB gets locked for too long by an external reader, the suite server program will eventually delete it and replace it with a new copy of the private DB, to ensure that both correctly reflect the suite state.

You can interrogate the public DB with the sqlite3 command line tool, the sqlite3 module in the Python standard library, or any other sqlite interface.

$ sqlite3 ~/cylc-run/foo/log/db << _END_
> .headers on
> select * from task_events where name is "foo";
> _END_
name|cycle|time|submit_num|event|message
foo|1|2017-03-12T11:06:09Z|1|submitted|
foo|1|2017-03-12T11:06:09Z|1|output completed|started
foo|1|2017-03-12T11:06:09Z|1|started|
foo|1|2017-03-12T11:06:19Z|1|output completed|succeeded
foo|1|2017-03-12T11:06:19Z|1|succeeded|

13.30. Disaster Recovery

If a suite run directory gets deleted or corrupted, the options for recovery are:

  • restore the run directory from back-up, and restart the suite
  • re-install from source, and warm start from the beginning of the current cycle point

A warm start (see Warm Start) does not need a suite state checkpoint, but it wipes out prior run history, and it could re-run a significant number of tasks that had already completed.

To restart the suite, the critical Cylc files that must be restored are:

# On the suite host:
~/cylc-run/SUITE-NAME/
    suite.rc   # live suite configuration (located here in Rose suites)
    log/db  # public suite DB (can just be a copy of the private DB)
    log/rose-suite-run.conf  # (needed to restart a Rose suite)
    .service/db  # private suite DB
    .service/source -> PATH-TO-SUITE-DIR  # symlink to live suite directory

# On job hosts (if no shared filesystem):
~/cylc-run/SUITE-NAME/
    log/job/CYCLE-POINT/TASK-NAME/SUBMIT-NUM/job.status

Note

This discussion does not address restoration of files generated and consumed by task jobs at run time. How suite data is stored and recovered in your environment is a matter of suite and system design.

In short, you can simply restore the suite service directory, the log directory, and the suite.rc file that is the target of the symlink in the service directory. The service and log directories will contain extra files that are not strictly needed for a restart, but that does not matter - although, depending on your log housekeeping, the log/job directory could be huge, so you may want to be selective about that. (Also, in a Rose suite the suite.rc file does not need to be restored if you restart with rose suite-run, which re-installs suite source files to the run directory.)

The public DB is not strictly required for a restart - the suite server program will recreate it if need be - but it is required by cylc ls-checkpoints if you need to identify the right restart checkpoint.

The job status files are only needed if the restart suite state checkpoint contains active tasks that need to be polled to determine what happened to them while the suite was down. Without them, polling will fail and those tasks will need to be manually set to the correct state.

Warning

It is not safe to copy or rsync a potentially-active sqlite DB - the copy might end up corrupted. It is best to stop the suite before copying a DB, or else write a back-up utility using the official sqlite backup API.
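
For example, a minimal back-up sketch using the sqlite3 command line shell's .backup command, which copies the database via the backup API (the destination path here is illustrative):

$ sqlite3 ~/cylc-run/SUITE-NAME/log/db ".backup '/path/to/backup/db'"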

13.31. Auto Stop-Restart

Cylc has the ability to automatically stop suites running on a particular host and, optionally, restart them on a different host. This is useful if a host needs to be taken off-line, e.g. for scheduled maintenance.

This functionality is configured via the following site configuration settings:

  • [suite servers] -> auto restart delay
  • [suite servers] -> condemned hosts
  • [suite servers] -> run hosts

The auto stop-restart feature has two modes:

  • [Normal Mode]
    • When a host is added to the condemned hosts list, any suites running on that host will automatically shut down and then restart, selecting a new host from the run hosts list.
    • For safety, before attempting to stop the suite cylc will first wait for any jobs running locally (under background or at) to complete.
    • In order for Cylc to be able to successfully restart suites, the run hosts must all be on a shared filesystem.
  • [Force Mode]
    • If a host is suffixed with an exclamation mark then Cylc will stop any suites running on it immediately, making no attempt to restart them, and any local jobs (running under background or at) will be left running.

For example in the following configuration any suites running on foo will attempt to restart on pub whereas any suites running on bar will stop immediately, making no attempt to restart.

[suite servers]
    run hosts = pub
    condemned hosts = foo, bar!

To prevent large numbers of suites attempting to restart simultaneously the auto restart delay setting defines a period of time in seconds. Suites will wait for a random period of time between zero and auto restart delay seconds before attempting to stop and restart.

Suites started in no-detach mode cannot be automatically restarted on a different host, as the new suite server program would still end up attached to the condemned host. A suite running in no-detach mode on a condemned host will therefore abort with a non-zero return code; the parent process should handle the restart of the suite manually if desired.

See the [suite servers] configuration section ([suite servers]) for more details.

[3] Late notification of clock-triggered tasks is not very useful in any case because they typically do not depend on other tasks, and as such they can often trigger on time even if the suite is delayed to the point that downstream tasks are late due to their dependence on previous-cycle tasks that are delayed.

13.32. Alternate Suite Run Directories

The cylc register command normally creates a suite run directory at the standard location ~/cylc-run/<SUITE-NAME>/. With the --run-dir option it can create the run directory at some other location, with a symlink from ~/cylc-run/<SUITE-NAME> to allow access via the standard file path.

This may be useful for quick-running Sub-Suites that generate large numbers of files - you could put their run directories on fast local disk or RAM disk, for performance and housekeeping reasons.
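
For example (a sketch only - the suite name and the fast-disk path are illustrative):

$ cylc register --run-dir=/fast/local/disk/$USER my.suite /path/to/my.suite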

13.33. Sub-Suites

A single Cylc suite can configure multiple cycling sequences in the graph, but cycles can’t be nested. If you need cycles within cycles - e.g. to iterate over many files generated by each run of a cycling task - current options are:

  • parameterize the sub-cycles
    • this is easy but it makes more tasks-per-cycle, which is the primary determinant of suite size and server program efficiency
  • run a separate cycling suite over the sub-cycle, inside a main-suite task, for each main-suite cycle point - i.e. use sub-suites
    • this is very efficient, but monitoring and run-directory housekeeping may be more difficult because it creates multiple suites and run directories

Sub-suites must be started with --no-detach so that the containing task does not finish until the sub-suite does, and they should be non-cycling or have a final cycle point so they don’t keep on running indefinitely.

Sub-suite names should normally incorporate the main-suite cycle point (use $CYLC_TASK_CYCLE_POINT in the cylc run command line to start the sub-suite), so that successive sub-suites can run concurrently if necessary and do not compete for the same run directory. This will generate a new sub-suite run directory for every main-suite cycle point, so you may want to put housekeeping tasks in the main suite to extract the useful products from each sub-suite run and then delete the sub-suite run directory.
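
For example, the containing main-suite task could look something like this sketch (the sub-suite name and source path are illustrative):

[runtime]
    [[run-sub-suite]]
        script = """
            cylc register sub-$CYLC_TASK_CYCLE_POINT $HOME/suites/sub
            cylc run --no-detach sub-$CYLC_TASK_CYCLE_POINT
        """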

For quick-running sub-suites that generate large numbers of files, consider using Alternate Suite Run Directories for better performance and easier housekeeping.

14. Suite Storage, Discovery, Revision Control, and Deployment

Small groups of cylc users can of course share suites by manual copying, and generic revision control tools can be used on cylc suites as on any other collection of files. Beyond this, cylc does not have a built-in solution for suite storage, discovery, revision control, and deployment on a network. That is not cylc's core purpose, and large sites may have preferred revision control systems and suite metadata requirements that are difficult to anticipate. We can, however, recommend the use of Rose to do all of this very easily and elegantly with cylc suites.

14.1. Rose

Rose is a framework for managing and running suites of scientific applications, developed at the Met Office for use with cylc. It is available under the open source GPL license.

15. Appendices

15.1. Suite.rc Reference

This appendix defines all legal suite configuration items. Embedded Jinja2 code (see Jinja2) must process to a valid raw suite.rc file. See also Suite.rc File Overview for a descriptive overview of suite.rc files, including syntax (Syntax).

15.1.1. Top Level Items

The only top level configuration items at present are the suite title and description.

15.1.2. [meta]

Section containing metadata items for this suite. Several items (title, description, URL) are pre-defined and are used by the GUI. Others can be user-defined and passed to suite event handlers to be interpreted according to your needs. For example, the value of a “suite-priority” item could determine how an event handler responds to failure events.

15.1.2.1. [meta] -> title

A single line description of the suite. It is displayed in the GUI “Open Another Suite” window and can be retrieved at run time with the cylc show command.

  • type: single line string
  • default: (none)
15.1.2.2. [meta] -> description

A multi-line description of the suite. It can be retrieved at run time with the cylc show command.

  • type: multi-line string
  • default: (none)
15.1.2.3. [meta] -> URL

A web URL to suite documentation. If present it can be browsed with the cylc doc command, or from the gcylc Suite menu. The string template %(suite_name)s will be replaced with the actual suite name. See also [runtime] -> [[__NAME__]] -> [[[meta]]] -> URL.

  • type: string (URL)
  • default: (none)
  • example: http://my-site.com/suites/%(suite_name)s/index.html
15.1.2.4. [meta] -> group

A group name for a suite. In the gscan GUI, suites with the same group name can be collapsed into a single state summary when the “group” column is displayed.

  • type: single line string
  • default: (none)
15.1.2.5. [meta] -> __MANY__

Replace __MANY__ with any user-defined metadata item. These, like title, URL, etc. can be passed to suite event handlers to be interpreted according to your needs. For example, “suite-priority”.

  • type: String or integer

  • default: (none)

  • example:

    [meta]
        suite-priority = high
    

15.1.3. [cylc]

This section is for configuration that is not specifically task-related.

15.1.3.1. [cylc] -> required run mode

If this item is set cylc will abort if the suite is not started in the specified mode. This can be used for demo suites that have to be run in simulation mode, for example, because they have been taken out of their normal operational context; or to prevent accidental submission of expensive real tasks during suite development.

  • type: string
  • legal values: live, dummy, dummy-local, simulation
  • default: None
15.1.3.2. [cylc] -> UTC mode

Cylc runs off the suite host’s system clock by default. This item allows you to run the suite in UTC even if the system clock is set to local time. Clock-trigger tasks will trigger when the current UTC time is equal to their cycle point date-time plus offset; other time values used, reported, or logged by the suite server program will usually also be in UTC. The default for this can be set at the site level (see [cylc] -> UTC mode).

  • type: boolean
  • default: False, unless overridden at site level.
15.1.3.3. [cylc] -> cycle point format

To just alter the timezone used in the date-time cycle point format, see [cylc] -> cycle point time zone. To just alter the number of expanded year digits (for years below 0 or above 9999), see [cylc] -> cycle point num expanded year digits.

Cylc usually uses a CCYYMMDDThhmmZ (Z in the special case of UTC) or CCYYMMDDThhmm±hhmm format (± standing for either + or - here) for writing down date-time cycle points, which follows one of the basic formats outlined in the ISO 8601 standard. For example, a cycle point on the 3rd of February 2001 at 4:50 in the morning, UTC (+0000 timezone), would be written 20010203T0450Z. Similarly, for the 3rd of February 2001 at 4:50 in the morning, +1300 timezone, cylc would write 20010203T0450+1300.

You may use the isodatetime library’s syntax to write dates and times in ISO 8601 formats - CC for century, YY for decade and decadal year, +X for expanded year digits and their positive or negative sign, thereafter following the ISO 8601 standard example notation except for fractional digits, which are represented as ,ii for hh, ,nn for mm, etc. For example, to write date-times as week dates with fractional hours, set cycle point format to CCYYWwwDThh,iiZ e.g. 1987W041T08,5Z for 08:30 UTC on Monday on the fourth ISO week of 1987.

You can also use a subset of the strptime/strftime POSIX standard - supported tokens are %F, %H, %M, %S, %Y, %d, %j, %m, %s, %z.

The ISO8601 extended date-time format can be used (%Y-%m-%dT%H:%M) but note that the “-” and “:” characters end up in job log directory paths.
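
For example, to select that extended format (a minimal sketch; note the caveat above about “-” and “:” in job log directory paths):

[cylc]
    cycle point format = %Y-%m-%dT%H:%M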

The pre cylc-6 legacy 10-digit date-time format YYYYMMDDHH is not ISO8601 compliant and can no longer be used as the cycle point format. For job scripts that still require the old format, use the cylc cyclepoint utility to translate the ISO8601 cycle point inside job scripts, e.g.:

[runtime]
    [[root]]
        [[[environment]]]
            CYCLE_TIME = $(cylc cyclepoint --template=%Y%m%d%H)
15.1.3.4. [cylc] -> cycle point num expanded year digits

For years below 0 or above 9999, the ISO 8601 standard specifies that an extra number of year digits and a sign should be used. This extra number needs to be written down somewhere (here).

For example, if this extra number is set to 2, 00Z on the 1st of January in the year 10040 will be represented as +0100400101T0000Z (2 extra year digits used). With this number set to 3, 06Z on the 4th of May 1985 would be written as +00019850504T0600Z.

This number defaults to 0 (no sign or extra digits used).

15.1.3.5. [cylc] -> cycle point time zone

If you set UTC mode to True ([cylc] -> UTC mode) then this will default to Z. If you use a custom cycle point format ([cylc] -> cycle point format), you should specify the timezone choice (or null timezone choice) here as well.

You may set your own time zone choice here, which will be used for all date-time cycle point dumping. Time zones should be expressed as ISO 8601 time zone offsets from UTC, such as +13, +1300, -0500 or +0645, with Z representing the special +0000 case. Cycle points will be converted to the time zone you give and will be represented with this string at the end.

Cycle points that are input without time zones (e.g. as an initial cycle point setting) will use this time zone if set. If this isn’t set (and UTC mode is also not set), then they will default to the current local time zone.

Note

The ISO standard also allows writing the hour and minute separated by a “:” (e.g. +13:00) - however, this is not recommended, given that the time zone is used as part of task output filenames.

15.1.3.6. [cylc] -> abort if any task fails

Cylc does not normally abort if tasks fail, but if this item is turned on it will abort with exit status 1 if any task fails.

  • type: boolean
  • default: False
15.1.3.7. [cylc] -> health check interval

Specify the time interval on which a running cylc suite will check that its run directory exists and that its contact file contains the expected information. If not, the suite will shut itself down automatically.

  • type: ISO 8601 duration/interval representation (e.g. PT5M, 5 minutes (note: by contrast, P5M means 5 months, so remember the T!)).
  • default: PT10M
15.1.3.8. [cylc] -> task event mail interval

Group together all the task event mail notifications into a single email within a given interval. This is useful to prevent flooding users’ mail boxes when many task events occur within a short period of time.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT5M
15.1.3.9. [cylc] -> disable automatic shutdown

This has the same effect as the --no-auto-shutdown flag for the suite run commands: it prevents the suite server program from shutting down normally when all tasks have finished (a suite timeout can still be used to stop the daemon after a period of inactivity, however). This option can make it easier to re-trigger tasks manually near the end of a suite run, during suite development and debugging.

  • type: boolean
  • default: False
15.1.3.10. [cylc] -> log resolved dependencies

If this is turned on cylc will write the resolved dependencies of each task to the suite log as it becomes ready to run (a list of the IDs of the tasks that actually satisfied its prerequisites at run time). Mainly used for cylc testing and development.

  • type: boolean
  • default: False
15.1.3.11. [cylc] -> [[parameters]]

Define parameter values here for use in expanding parameterized tasks - see Parameterized Tasks.

  • type: list of strings, or an integer range LOWER..UPPER..STEP (two dots, inclusive bounds, “STEP” optional)
  • default: (none)
  • examples:
    • run = control, test1, test2
    • mem = 1..5 (equivalent to 1, 2, 3, 4, 5)
    • mem = -11..-7..2 (equivalent to -11, -9, -7)
15.1.3.12. [cylc] -> [[parameter templates]]

Parameterized task names (see previous item, and Parameterized Tasks) are expanded, for each parameter value, using string templates. You can assign templates to parameter names here, to override the default templates.

  • type: a Python-style string template
  • default for integer parameters p: _p%(p)0Nd where N is the number of digits of the maximum integer value, e.g. foo<run> becomes foo_run3 for run value 3.
  • default for non-integer parameters p: _%(p)s e.g. foo<run> becomes foo_top for run value top.
  • example: run = -R%(run)s e.g. foo<run> becomes foo-R3 for run value 3.

Note

The values of a parameter named p are substituted for %(p)s. In _run%(run)s the first “run” is a string literal, and the second gets substituted with each value of the parameter.
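
Putting the previous two items together, a minimal sketch:

[cylc]
    [[parameters]]
        run = 1..3
    [[parameter templates]]
        # foo<run> now expands to foo-R1, foo-R2, foo-R3
        run = -R%(run)s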

15.1.3.13. [cylc] -> [[events]]

Cylc has internal “hooks” to which you can attach handlers that are called by the suite server program whenever certain events occur. This section configures suite event hooks; see [runtime] -> [[__NAME__]] -> [[[events]]] for task event hooks.

Event handler commands can send an email or an SMS, call a pager, intervene in the operation of their own suite, or whatever. They can be held in the suite bin directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which cylc runs, on the suite host). The commands should require very little resource to run and should return quickly.

Each event handler can be specified as a list of command lines or command line templates.

A command line template may have any or all of these patterns which will be substituted with actual values:

  • %(event)s: event name (see below)
  • %(suite)s: suite name
  • %(suite_url)s: suite URL
  • %(suite_uuid)s: suite UUID string
  • %(message)s: event message, if any
  • any suite [meta] item, e.g.:
    • %(title)s: suite title
    • %(importance)s: example custom suite metadata

Otherwise the command line will be called with the following default arguments:

<suite-event-handler> %(event)s %(suite)s %(message)s

Note

Substitution patterns should not be quoted in the template strings. This is done automatically where required.

Additional information can be passed to event handlers via [cylc] -> [[environment]].
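
For example, a sketch of a suite configured to email on key events and to run a custom handler if it stalls (my-handler.sh is hypothetical):

[cylc]
    [[events]]
        mail events = startup, shutdown, timeout
        stalled handler = my-handler.sh %(event)s %(suite)s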

15.1.3.13.1. [cylc] -> [[events]] -> EVENT handler

A comma-separated list of one or more event handlers to call when one of the following EVENTs occurs:

  • startup - the suite has started running
  • shutdown - the suite is shutting down
  • aborted - the suite is shutting down due to unexpected/unrecoverable error
  • timeout - the suite has timed out
  • stalled - the suite has stalled
  • inactivity - the suite is inactive

Default values for these can be set at the site level via the siterc file (see [cylc] -> [[events]]).

Item details:

  • type: string (event handler script name)
  • default: None, unless defined at the site level.
  • example: startup handler = my-handler.sh
15.1.3.13.2. [cylc] -> [[events]] -> handlers

Specify the general event handlers as a list of command lines or command line templates.

  • type: Comma-separated list of strings (event handler command line or command line templates).
  • default: (none)
  • example: handlers = my-handler.sh
15.1.3.13.3. [cylc] -> [[events]] -> handler events

Specify the events for which the general event handlers should be invoked.

  • type: Comma-separated list of events
  • default: (none)
  • example: handler events = timeout, shutdown
15.1.3.13.4. [cylc] -> [[events]] -> mail events

Specify the suite events for which notification emails should be sent.

  • type: Comma-separated list of events
  • default: (none)
  • example: mail events = startup, shutdown, timeout
15.1.3.13.6. [cylc] -> [[events]] -> mail from

Specify an alternate from: email address for suite event notifications.

15.1.3.13.7. [cylc] -> [[events]] -> mail smtp

Specify the SMTP server for sending suite event email notifications.

  • type: string
  • default: None, (localhost:25)
  • example: mail smtp = smtp.yourorg
15.1.3.13.8. [cylc] -> [[events]] -> mail to

A list of email addresses to send suite event notifications. The list can be anything accepted by the mail command.

  • type: string
  • default: None, (USER@HOSTNAME)
  • example: mail to = your.colleague
15.1.3.13.9. [cylc] -> [[events]] -> timeout

If a timeout is set and the timeout event is handled, the timeout event handler(s) will be called if the suite stays in a stalled state for some period of time. The timer is set initially at suite start up. It is possible to set a default for this at the site level (see [cylc] -> [[events]]).

  • type: ISO 8601 duration/interval representation (e.g. PT5S, 5 seconds, PT1S, 1 second) - minimum 0 seconds.
  • default: (none), unless set at the site level.
15.1.3.13.10. [cylc] -> [[events]] -> inactivity

If inactivity is set and the inactivity event is handled, the inactivity event handler(s) will be called if there is no activity in the suite for some period of time. The timer is set initially at suite start up. It is possible to set a default for this at the site level (see [cylc] -> [[events]]).

  • type: ISO 8601 duration/interval representation (e.g. PT5S, 5 seconds, PT1S, 1 second) - minimum 0 seconds.
  • default: (none), unless set at the site level.
15.1.3.13.11. [cylc] -> [[events]] -> abort on stalled

If this is set to True it will cause the suite to abort with error status if it stalls. A suite is considered “stalled” if there are no active, queued or submitting tasks or tasks waiting for clock triggers to be met. It is possible to set a default for this at the site level (see [cylc] -> [[events]]).

  • type: boolean
  • default: False, unless set at the site level.
15.1.3.13.12. [cylc] -> [[events]] -> abort on timeout

If a suite timer is set (above) this will cause the suite to abort with error status if the suite times out while still running. It is possible to set a default for this at the site level (see [cylc] -> [[events]]).

  • type: boolean
  • default: False, unless set at the site level.
15.1.3.13.13. [cylc] -> [[events]] -> abort on inactivity

If a suite inactivity timer is set (above) this will cause the suite to abort with error status if the suite is inactive for some period while still running. It is possible to set a default for this at the site level (see [cylc] -> [[events]]).

  • type: boolean
  • default: False, unless set at the site level.
15.1.3.13.14. [cylc] -> [[events]] -> abort if EVENT handler fails

Cylc does not normally care whether an event handler succeeds or fails, but if this is turned on the EVENT handler will be executed in the foreground (which will block the suite while it is running) and the suite will abort if the handler fails.

  • type: boolean
  • default: False
15.1.3.13.15. [cylc] -> [[environment]]

Environment variables defined in this section are passed to suite and task event handlers.

  • These variables are not passed to tasks - use task runtime variables for that. Similarly, task runtime variables are not available to event handlers, which are executed by the suite server program (not by running tasks) in response to task events.

  • Cylc-defined environment variables such as $CYLC_SUITE_RUN_DIR are not passed to task event handlers by default, but you can make them available by extracting them to the cylc environment like this:

    [cylc]
        [[environment]]
            CYLC_SUITE_RUN_DIR = $CYLC_SUITE_RUN_DIR
    
  • These variables - unlike task execution environment variables which are written to job scripts and interpreted by the shell at task run time - are not interpreted by the shell prior to use so shell variable expansion expressions cannot be used here.

15.1.3.13.16. [cylc] -> [[environment]] -> __VARIABLE__

Replace __VARIABLE__ with any number of environment variable assignment expressions. Values may refer to other local environment variables (order of definition is preserved) and are not evaluated or manipulated by cylc, so any variable assignment expression that is legal in the shell in which cylc is running can be used (but see the warning above on variable expansions, which will not be evaluated). White space around the = is allowed (as far as cylc’s file parser is concerned these are just suite configuration items).

  • type: string
  • default: (none)
  • examples: FOO = $HOME/foo
15.1.3.13.17. [cylc] -> [[reference test]]

Reference tests are finite-duration suite runs that abort with non-zero exit status if cylc fails, if any task fails, if the suite times out, or if a shutdown event handler that (by default) compares the test run with a reference run reports failure. See Automated Reference Test Suites.

15.1.3.13.18. [cylc] -> [[reference test]] -> suite shutdown event handler

A shutdown event handler that should compare the test run with the reference run, exiting with zero exit status only if the test run verifies.

  • type: string (event handler command name or path)
  • default: cylc hook check-triggering

As for any event handler, the full path can be omitted if the script is located somewhere in $PATH or in the suite bin directory.

15.1.3.13.19. [cylc] -> [[reference test]] -> required run mode

If your reference test is only valid for a particular run mode, this setting will cause cylc to abort if a reference test is attempted in another run mode.

  • type: string
  • legal values: live, dummy, dummy-local, simulation
  • default: None
15.1.3.13.20. [cylc] -> [[reference test]] -> allow task failures

A reference test run will abort immediately if any task fails, unless this item is set, or a list of expected task failures is provided (below).

  • type: boolean
  • default: False
15.1.3.13.21. [cylc] -> [[reference test]] -> expected task failures

A reference test run will abort immediately if any task fails, unless allow task failures is set (above) or the failed task is found in a list of IDs of tasks that are expected to fail.

  • type: Comma-separated list of strings (task IDs: name.cycle_point).
  • default: (none)
  • example: foo.20120808, bar.20120908
15.1.3.13.22. [cylc] -> [[reference test]] -> live mode suite timeout

The timeout value, expressed as an ISO 8601 duration/interval, after which the test run should be aborted if it has not finished, in live mode. Test runs cannot be done in live mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.

  • type: ISO 8601 duration/interval representation, e.g. PT5M is 5 minutes (note: by contrast P5M means 5 months, so remember the T!).
  • default: PT1M (1 minute)
15.1.3.13.23. [cylc] -> [[reference test]] -> simulation mode suite timeout

The timeout value, expressed as an ISO 8601 duration/interval, after which the test run should be aborted if it has not finished, in simulation mode. Test runs cannot be done in simulation mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.

  • type: ISO 8601 duration/interval representation (e.g. PT5M, 5 minutes (note: by contrast, P5M means 5 months, so remember the T!)).
  • default: PT1M (1 minute)
15.1.3.13.24. [cylc] -> [[reference test]] -> dummy mode suite timeout

The timeout value, expressed as an ISO 8601 duration/interval, after which the test run should be aborted if it has not finished, in dummy mode. Test runs cannot be done in dummy mode unless you define a value for this item, because it is not possible to arrive at a sensible default for all suites.

  • type: ISO 8601 duration/interval representation (e.g. PT5M, 5 minutes (note: by contrast, P5M means 5 months, so remember the T!)).
  • default: PT1M (1 minute)
15.1.3.14. [cylc] -> [[authentication]]

Authentication of client programs with suite server programs can be set in the global site/user config files and overridden here if necessary. See [authentication] for more information.

15.1.3.14.1. [cylc] -> [[authentication]] -> public

The client privilege level granted for public access - i.e. no suite passphrase required. See [authentication] for legal values.

15.1.3.15. [cylc] -> [[simulation]]

Suite-level configuration for the simulation and dummy run modes described in Simulating Suite Behaviour.

15.1.3.15.1. [cylc] -> [[simulation]] -> disable suite event handlers

If this is set to True configured suite event handlers will not be called in simulation or dummy modes.

  • type: boolean
  • default: True

15.1.4. [scheduling]

This section allows cylc to determine when tasks are ready to run.

15.1.4.1. [scheduling] -> cycling mode

Cylc runs using the proleptic Gregorian calendar by default. This item allows you to either run the suite using the 360 day calendar (12 months of 30 days in a year) or using integer cycling. It also supports use of the 365 (never a leap year) and 366 (always a leap year) calendars.

  • type: string
  • legal values: gregorian, 360day, 365day, 366day, integer
  • default: gregorian
15.1.4.2. [scheduling] -> initial cycle point

In a cold start each cycling task (unless specifically excluded under [special tasks]) will be loaded into the suite with this cycle point, or with the closest subsequent valid cycle point for the task. This item can be overridden on the command line or in the gcylc suite start panel.

In date-time cycling, if you do not provide time zone information for this, it will be assumed to be local time, or in UTC if [cylc] -> UTC mode is set, or in the time zone determined by [cylc] -> cycle point time zone if that is set.

  • type: ISO 8601 date-time point representation (e.g. CCYYMMDDThhmm, 19951231T0630) or “now”.
  • default: (none)

The string “now” converts to the current date-time on the suite host (adjusted to UTC if the suite is in UTC mode but the host is not) to minute resolution. Minutes (or hours, etc.) may be ignored depending on your cycle point format ([cylc] -> cycle point format).

15.1.4.3. [scheduling] -> [[initial cycle point]] -> initial cycle point relative to current time

This can be used to set the initial cycle point time relative to the current time.

Two additional commands, next and previous, can be used when setting the initial cycle point.

The syntax uses truncated ISO8601 time representations, and is of the style: next(Thh:mmZ), previous(T-mm); e.g.

  • initial cycle point = next(T15:00Z)
  • initial cycle point = previous(T09:00)
  • initial cycle point = next(T12)
  • initial cycle point = previous(T-20)

Examples of interpretation are given in Table 1.

A list of times, separated by semicolons, can be provided, e.g. next(T-00;T-15;T-30;T-45). At least one time is required within the brackets, and if more than one is given, the major time unit in each (hours or minutes) should all be of the same type.

If an offset from the specified date or time is required, this should be used in the form: previous(Thh:mm) +/- PxTy in the same way as is used for determining cycle periods, e.g.

  • initial cycle point = previous(T06) +P1D
  • initial cycle point = next(T-30) -PT1H

The section in the bracket attached to the next/previous command is interpreted first, and then the offset is applied.

The offset can also be used independently without a next or previous command, and will be interpreted as an offset from “now”.

Table 1 Examples of setting relative initial cycle point for times and offsets using now = 2018-03-14T15:12Z (and UTC mode)

  Syntax                            Interpretation
  next(T-00)                        2018-03-14T16:00Z
  previous(T-00)                    2018-03-14T15:00Z
  next(T-00; T-15; T-30; T-45)      2018-03-14T15:15Z
  previous(T-00; T-15; T-30; T-45)  2018-03-14T15:00Z
  next(T00)                         2018-03-15T00:00Z
  previous(T00)                     2018-03-14T00:00Z
  next(T06:30Z)                     2018-03-15T06:30Z
  previous(T06:30) -P1D             2018-03-13T06:30Z
  next(T00; T06; T12; T18)          2018-03-14T18:00Z
  previous(T00; T06; T12; T18)      2018-03-14T12:00Z
  next(T00; T06; T12; T18) +P1W     2018-03-21T18:00Z
  PT1H                              2018-03-14T16:12Z
  -P1M                              2018-02-14T15:12Z

The relative initial cycle point also works with truncated dates, including weeks and ordinal dates, using ISO 8601 truncated date representations. Note that the day of the week should always be specified when using weeks. If a time is not included, the next or previous corresponding point will be calculated from midnight of the current day. Examples of interpretation are given in Table 2.

Table 2 Examples of setting relative initial cycle point for dates using now = 2018-03-14T15:12Z (and UTC mode)

  Syntax                        Interpretation
  next(-00)                     2100-01-01T00:00Z
  previous(--01)                2018-01-01T00:00Z
  next(---01)                   2018-04-01T00:00Z
  previous(--1225)              2017-12-25T00:00Z
  next(-2006)                   2020-06-01T00:00Z
  previous(-W101)               2018-03-05T00:00Z
  next(-W-1; -W-3; -W-5)        2018-03-14T00:00Z
  next(-001; -091; -181; -271)  2018-04-01T00:00Z
  previous(-365T12Z)            2017-12-31T12:00Z
15.1.4.4. [scheduling] -> final cycle point

Cycling tasks are held once they pass the final cycle point, if one is specified. Once all tasks have achieved this state the suite will shut down. If this item is provided you can override it on the command line or in the gcylc suite start panel.

In date-time cycling, if you do not provide time zone information for this, it will be assumed to be local time, or in UTC if [cylc] -> UTC mode is set, or in the [cylc] -> cycle point time zone if that is set.

  • type: ISO 8601 date-time point representation (e.g. CCYYMMDDThhmm, 19951231T1230) or ISO 8601 date-time offset (e.g. +P1D+PT6H)
  • default: (none)
15.1.4.5. [scheduling] -> initial cycle point constraints

In a cycling suite it is possible to restrict the initial cycle point by defining a list of truncated time points under the initial cycle point constraints.

  • type: Comma-separated list of ISO 8601 truncated time point representations (e.g. T00, T06, T-30).
  • default: (none)
15.1.4.6. [scheduling] -> final cycle point constraints

In a cycling suite it is possible to restrict the final cycle point by defining a list of truncated time points under the final cycle point constraints.

  • type: Comma-separated list of ISO 8601 truncated time point representations (e.g. T00, T06, T-30).
  • default: (none)
15.1.4.7. [scheduling] -> hold after point

Cycling tasks are held once they pass the hold after cycle point, if one is specified. Unlike the final cycle point, the suite will not shut down once all tasks have passed this point. If this item is provided you can override it on the command line or in the gcylc suite start panel.

15.1.4.8. [scheduling] -> runahead limit

Runahead limiting prevents the fastest tasks in a suite from getting too far ahead of the slowest ones, as documented in Runahead Limiting.

This config item specifies a hard limit as a cycle interval between the slowest and fastest tasks. It is deprecated in favour of the newer default limiting by max active cycle points ([scheduling] -> max active cycle points).

  • type: Cycle interval string e.g. PT12H for a 12 hour limit under ISO 8601 cycling.
  • default: (none)
15.1.4.9. [scheduling] -> max active cycle points

Runahead limiting prevents the fastest tasks in a suite from getting too far ahead of the slowest ones, as documented in Runahead Limiting.

This config item supersedes the deprecated hard runahead limit ([scheduling] -> runahead limit). It allows up to N (default 3) consecutive cycle points to be active at any time, adjusted up if necessary for any future triggering.

  • type: integer
  • default: 3
15.1.4.10. [scheduling] -> spawn to max active cycle points

Allows tasks to spawn out to max active cycle points ([scheduling] -> max active cycle points), removing the restriction that a task must have submitted before its successor can be spawned.

Important: This should be used with care, given the potential impact of the additional task proxies, both in terms of memory and CPU for the cylc daemon and in the overhead of rendering all the additional tasks in gcylc. Use of this setting may also expose problems in suites that rely on the default behaviour, where downstream tasks wait on upstream ones submitting and the suite would otherwise have stalled: for example, a housekeeping task at a later cycle could now delete an earlier cycle's data before that cycle has had a chance to run, where previously it would not have been spawned until its predecessor had been submitted.

  • type: boolean
  • default: False
15.1.4.11. [scheduling] -> [[queues]]

Configuration of internal queues, by which the number of simultaneously active tasks (submitted or running) can be limited, per queue. By default a single queue called default is defined, with all tasks assigned to it and no limit. To use a single queue for the whole suite just set the limit on the default queue as required. See also Limiting Activity With Internal Queues.

15.1.4.11.1. [scheduling] -> [[queues]] -> [[[__QUEUE__]]]

Section heading for configuration of a single queue. Replace __QUEUE__ with a queue name, and repeat the section as required.

  • type: string
  • default: “default”
15.1.4.11.1.1. [scheduling] -> [[queues]] -> [[[__QUEUE__]]] -> limit

The maximum number of active tasks allowed at any one time, for this queue.

  • type: integer
  • default: 0 (i.e. no limit)
15.1.4.11.1.2. [scheduling] -> [[queues]] -> [[[__QUEUE__]]] -> members

A list of member tasks, or task family names, to assign to this queue (assigned tasks will automatically be removed from the default queue).

  • type: Comma-separated list of strings (task or family names).
  • default: none for user-defined queues; all tasks for the “default” queue
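
For example, the following sketch (queue and task names are illustrative) limits the suite as a whole to ten active tasks, and two expensive model tasks to one active instance at a time:

[scheduling]
    [[queues]]
        [[[default]]]
            limit = 10
        [[[models]]]
            limit = 1
            members = model_a, model_b
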
15.1.4.12. [scheduling] -> [[xtriggers]]

This section is for External Trigger function declarations - see External Triggers.

15.1.4.12.1. [scheduling] -> [[xtriggers]] -> __MANY__

Replace __MANY__ with any user-defined event trigger function declarations and corresponding labels for use in the graph:

  • type: string: function signature followed by optional call interval
  • example: trig_1 = my_trigger(arg1, arg2, kwarg1, kwarg2):PT10S

(See External Triggers for details).
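
For example, a declared trigger label is used in the graph with an @ prefix; a sketch assuming the built-in wall_clock xtrigger function described in External Triggers:

[scheduling]
    initial cycle point = 2018
    [[xtriggers]]
        clock_1 = wall_clock(offset=PT1H)
    [[dependencies]]
        [[[P1D]]]
            graph = "@clock_1 => foo"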

15.1.4.13. [scheduling] -> [[special tasks]]

This section is used to identify tasks with special behaviour. Family names can be used in special task lists as shorthand for listing all member tasks.

15.1.4.13.1. [scheduling] -> [[special tasks]] -> clock-trigger

Note

Please read External Triggers before using the older clock triggers described in this section.

Clock-trigger tasks (see Clock Triggers) wait on a wall clock time specified as an offset from their own cycle point.

  • type: Comma-separated list of task or family names with associated date-time offsets expressed as ISO8601 interval strings, positive or negative, e.g. PT1H for 1 hour. The offset specification may be omitted to trigger right on the cycle point.

  • default: (none)

  • example:

    clock-trigger = foo(PT1H30M), bar(PT1.5H), baz
    
15.1.4.13.2. [scheduling] -> [[special tasks]] -> clock-expire

Clock-expire tasks enter the expired state and skip job submission if too far behind the wall clock when they become ready to run. The expiry time is specified as an offset from wall-clock time; typically it should be negative - see Clock-Expire Triggers.

  • type: Comma-separated list of task or family names with associated date-time offsets expressed as ISO8601 interval strings, positive or negative, e.g. PT1H for 1 hour. The offset may be omitted if it is zero.

  • default: (none)

  • example:

    clock-expire = foo(-P1D)
    
15.1.4.13.3. [scheduling] -> [[special tasks]] -> external-trigger

Note

Please read External Triggers before using the older mechanism described in this section.

Externally triggered tasks (see Old-Style External Triggers (Deprecated)) wait on external events reported via the cylc ext-trigger command. To constrain triggers to a specific cycle point, include $CYLC_TASK_CYCLE_POINT in the trigger message string and pass the cycle point to the cylc ext-trigger command.

  • type: Comma-separated list of task names with associated external trigger message strings.

  • default: (none)

  • example: (note the comma and line-continuation character)

    external-trigger = get-satx("new sat-X data ready"),
                       get-saty("new sat-Y data ready for $CYLC_TASK_CYCLE_POINT")
    
15.1.4.13.4. [scheduling] -> [[special tasks]] -> sequential

Sequential tasks automatically depend on their own previous-cycle instance. This declaration is deprecated in favour of explicit inter-cycle triggers - see Special Sequential Tasks.

  • type: Comma-separated list of task or family names.
  • default: (none)
  • example: sequential = foo, bar
15.1.4.13.5. [scheduling] -> [[special tasks]] -> exclude at start-up

Any task listed here will be excluded from the initial task pool (this goes for suite restarts too). If an inclusion list is also specified, the initial pool will contain only included tasks that have not been excluded. Excluded tasks can still be inserted at run time. Other tasks may still depend on excluded tasks if they have not been removed from the suite dependency graph, in which case some manual triggering, or insertion of excluded tasks, may be required.

  • type: Comma-separated list of task or family names.
  • default: (none)
15.1.4.13.6. [scheduling] -> [[special tasks]] -> include at start-up

If this list is not empty, any task not listed in it will be excluded from the initial task pool (this goes for suite restarts too). If an exclusion list is also specified, the initial pool will contain only included tasks that have not been excluded. Excluded tasks can still be inserted at run time. Other tasks may still depend on excluded tasks if they have not been removed from the suite dependency graph, in which case some manual triggering, or insertion of excluded tasks, may be required.

  • type: Comma-separated list of task or family names.
  • default: (none)
15.1.4.14. [scheduling] -> [[dependencies]]

The suite dependency graph is defined under this section. You can plot the dependency graph as you work on it, with cylc graph or by right clicking on the suite in the db viewer. See also Scheduling - Dependency Graphs.

15.1.4.14.1. [scheduling] -> [[dependencies]] -> graph

The dependency graph for a completely non-cycling suite can go here. See also [scheduling] -> [[dependencies]] -> [[[__RECURRENCE__]]] -> graph below and Scheduling - Dependency Graphs, for graph string syntax.

15.1.4.14.2. [scheduling] -> [[dependencies]] -> [[[__RECURRENCE__]]]

__RECURRENCE__ section headings define the sequence of cycle points for which the subsequent graph section is valid. These should be specified in our ISO 8601 derived sequence syntax, or similar for integer cycling:

  • examples:
    • date-time cycling: [[[T00,T06,T12,T18]]] or [[[PT6H]]]
    • integer cycling (stepped by 2): [[[P2]]]
  • default: (none)

See Graph Types for more on recurrence expressions, and how multiple graph sections combine.

15.1.4.14.2.1. [scheduling] -> [[dependencies]] -> [[[__RECURRENCE__]]] -> graph

The dependency graph for a given recurrence section goes here. Syntax examples follow; see also Scheduling - Dependency Graphs and Task Triggering.

  • type: string

  • examples:

    graph = """
        foo => bar => baz & waz     # baz and waz both trigger off bar
        foo[-P1D-PT6H] => bar       # bar triggers off foo[-P1D-PT6H]
        baz:out1 => faz             # faz triggers off a message output of baz
        X:start => Y                # Y triggers if X starts executing
        X:fail => Y                 # Y triggers if X fails
        foo[-PT6H]:fail => bar      # bar triggers if foo[-PT6H] fails
        X => !Y                     # Y suicides if X succeeds
        X | X:fail => Z             # Z triggers if X succeeds or fails
        X:finish => Z               # Z triggers if X succeeds or fails
        (A | B & C ) | D => foo     # general conditional triggers
        foo:submit => bar           # bar triggers if foo is successfully submitted
        foo:submit-fail => bar      # bar triggers if submission of foo fails
        # comment
    """
    
  • default: (none)

15.1.5. [runtime]

This section is used to specify how, where, and what to execute when tasks are ready to run. Common configuration can be factored out in a multiple-inheritance hierarchy of runtime namespaces that culminates in the tasks of the suite. Order of precedence is determined by the C3 linearization algorithm as used to find the method resolution order in Python language class hierarchies. For details and examples see Runtime - Task Configuration.

15.1.5.1. [runtime] -> [[__NAME__]]

Replace __NAME__ with a namespace name, or a comma-separated list of names, and repeat as needed to define all tasks in the suite. Names must be valid according to the restrictions outlined in Task and Namespace Names.

A namespace represents a group or family of tasks if other namespaces inherit from it, or a task if no others inherit from it.

  • legal values:
    • [[foo]]
    • [[foo, bar, baz]]

If multiple names are listed the subsequent settings apply to each.

All namespaces inherit initially from root, which can be explicitly configured to provide or override default settings for all tasks in the suite.
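
For example, a sketch of a small namespace hierarchy (all names are illustrative):

[runtime]
    [[root]]
        # default settings for all tasks in the suite
        script = echo "hello from $CYLC_TASK_NAME"
    [[MODELS]]
        # a family - both members below inherit these settings
        [[[job]]]
            batch system = pbs
    [[model_a, model_b]]
        inherit = MODELS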

15.1.5.1.1. [runtime] -> [[__NAME__]] -> extra log files

A list of user-defined log files associated with a task. Files defined here will appear alongside the default log files in the cylc GUI. Log files must reside in the job log directory $CYLC_TASK_LOG_DIR and ideally should be named using the $CYLC_TASK_LOG_ROOT prefix (see Task Job Script Variables).

  • type: Comma-separated list of strings (log file names).
  • default: (none)
  • example: job.custom-log-name
15.1.5.1.2. [runtime] -> [[__NAME__]] -> inherit

A list of the immediate parent(s) this namespace inherits from. If no parents are listed root is assumed.

  • type: Comma-separated list of strings (parent namespace names).
  • default: root
15.1.5.1.3. [runtime] -> [[__NAME__]] -> init-script

Custom script invoked by the task job script before the task execution environment is configured - so it does not have access to any suite or task environment variables. It can be an external command or script, or inlined scripting. The original intention for this item was to allow remote tasks to source login scripts to configure their access to cylc, but this should no longer be necessary (see Task Job Access To Cylc). See also env-script, pre-script, script, post-script, err-script, exit-script.

  • type: string
  • default: (none)
  • example: init-script = "echo Hello World"
15.1.5.1.4. [runtime] -> [[__NAME__]] -> env-script

Custom script invoked by the task job script between the cylc-defined environment (suite and task identity, etc.) and the user-defined task runtime environment - so it has access to the cylc environment (and the task environment has access to variables defined by this scripting). It can be an external command or script, or inlined scripting. See also init-script, pre-script, script, post-script, err-script, and exit-script.

  • type: string
  • default: (none)
  • example: env-script = "echo Hello World"
15.1.5.1.5. [runtime] -> [[__NAME__]] -> pre-script

Custom script invoked by the task job script immediately before the script item (just below). It can be an external command or script, or inlined scripting. See also init-script, env-script, script, post-script, err-script, and exit-script.

  • type: string

  • default: (none)

  • example:

    pre-script = """
      . $HOME/.profile
      echo Hello from suite ${CYLC_SUITE_NAME}!"""
    
15.1.5.1.6. [runtime] -> [[__NAME__]] -> script

The main custom script invoked from the task job script. It can be an external command or script, or inlined scripting. See also init-script, env-script, pre-script, post-script, err-script, and exit-script.

  • type: string
  • root default: (none)
15.1.5.1.7. [runtime] -> [[__NAME__]] -> post-script

Custom script invoked by the task job script immediately after the script item (just above). It can be an external command or script, or inlined scripting. See also init-script, env-script, pre-script, script, err-script, and exit-script.

  • type: string
  • default: (none)
15.1.5.1.8. [runtime] -> [[__NAME__]] -> err-script

Custom script to be invoked at the end of the error trap, which is triggered due to failure of a command in the task job script or trappable job kill. The output of this will always be sent to STDERR and $1 is set to the name of the signal caught by the error trap. The script should be fast and use very little system resource to ensure that the error trap can return quickly. Companion of exit-script, which is executed on job success. It can be an external command or script, or inlined scripting. See also init-script, env-script, pre-script, script, post-script, and exit-script.

  • type: string
  • default: (none)
  • example: err-script = "printenv FOO"
15.1.5.1.9. [runtime] -> [[__NAME__]] -> exit-script

Custom script invoked at the very end of successful job execution, just before the job script exits. It should execute very quickly. Companion of err-script, which is executed on job failure. It can be an external command or script, or inlined scripting. See also init-script, env-script, pre-script, script, post-script, and err-script.

  • type: string
  • default: (none)
  • example: exit-script = "rm -f $TMP_FILES"
15.1.5.1.10. [runtime] -> [[__NAME__]] -> work sub-directory

Task job scripts are executed from within work directories created automatically under the suite run directory. A task can get its own work directory from $CYLC_TASK_WORK_DIR (or simply $PWD if it does not cd elsewhere at runtime). The default directory path contains the task name and cycle point, to provide a unique workspace for every instance of every task. If several tasks need to exchange files by simply reading and writing in their current working directory, this item can be used to override the default so that they all use the same workspace.

The top level share and work directory location can be changed (e.g. to a large data area) by a global config setting (see [hosts] -> [[HOST]] -> work directory).

  • type: string (directory path, can contain environment variables)
  • default: $CYLC_TASK_CYCLE_POINT/$CYLC_TASK_NAME
  • example: $CYLC_TASK_CYCLE_POINT/shared/

Note

If you omit cycle point from the work sub-directory path successive instances of the task will share the same workspace. Consider the effect on cycle point offset housekeeping of work directories before doing this.
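
For example, to make two tasks share a workspace within each cycle point (a sketch; the task names are illustrative):

[runtime]
    [[proc1, proc2]]
        work sub-directory = $CYLC_TASK_CYCLE_POINT/shared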

15.1.5.1.11. [runtime] -> [[__NAME__]] -> [[[meta]]]

Section containing metadata items for this task or family namespace. Several items (title, description, URL) are pre-defined and are used by the GUI. Others can be user-defined and passed to task event handlers to be interpreted according to your needs. For example, the value of an “importance” item could determine how an event handler responds to task failure events.

Any suite meta item can now be passed to task event handlers by prefixing the string template item name with “suite_”, for example:

[runtime]
    [[root]]
        [[[events]]]
            failed handler = send-help.sh %(suite_title)s %(suite_importance)s %(title)s
15.1.5.1.11.1. [runtime] -> [[__NAME__]] -> [[[meta]]] -> title

A single line description of this namespace. It is displayed by the cylc list command and can be retrieved from running tasks with the cylc show command.

  • type: single line string
  • root default: (none)
15.1.5.1.11.2. [runtime] -> [[__NAME__]] -> [[[meta]]] -> description

A multi-line description of this namespace, retrievable from running tasks with the cylc show command.

  • type: multi-line string
  • root default: (none)
15.1.5.1.11.3. [runtime] -> [[__NAME__]] -> [[[meta]]] -> URL

A web URL to task documentation for this suite. If present it can be browsed with the cylc doc command, or by right-clicking on the task in gcylc. The string templates %(suite_name)s and %(task_name)s will be replaced with the actual suite and task names. See also [meta] -> URL.

  • type: string (URL)

  • default: (none)

  • example: you can set URLs to all tasks in a suite by putting something like the following in the root namespace:

    [runtime]
        [[root]]
            [[[meta]]]
                URL = http://my-site.com/suites/%(suite_name)s/%(task_name)s.html
    

Note

URLs containing the comment delimiter # must be protected by quotes.

15.1.5.1.11.4. [runtime] -> [[__NAME__]] -> [[[meta]]] -> __MANY__

Replace __MANY__ with any user-defined metadata item. These, like title, URL, etc. can be passed to task event handlers to be interpreted according to your needs. For example, the value of an “importance” item could determine how an event handler responds to task failure events.

  • type: String or integer

  • default: (none)

  • example:

    [runtime]
        [[root]]
            [[[meta]]]
                importance = high
                color = red
    
15.1.5.1.12. [runtime] -> [[__NAME__]] -> [[[job]]]

This section configures the means by which cylc submits task job scripts to run.

15.1.5.1.12.1. [runtime] -> [[__NAME__]] -> [[[job]]] -> batch system

See Task Job Submission and Management for how job submission works, and how to define new handlers for different batch systems. Cylc has a number of built in batch system handlers:

  • type: string
  • legal values:
    • background - invoke a child process
    • at - the rudimentary Unix at scheduler
    • loadleveler - IBM LoadLeveler llsubmit, with directives defined in the suite.rc file
    • lsf - IBM Platform LSF bsub, with directives defined in the suite.rc file
    • pbs - PBS qsub, with directives defined in the suite.rc file
    • sge - Sun Grid Engine qsub, with directives defined in the suite.rc file
    • slurm - Simple Linux Utility for Resource Management sbatch, with directives defined in the suite.rc file
    • moab - Moab workload manager msub, with directives defined in the suite.rc file
  • default: background
15.1.5.1.12.2. [runtime] -> [[__NAME__]] -> [[[job]]] -> execution time limit

Specify the execution wall clock limit for a job of the task. For background and at, the job script will be invoked using the timeout command. For other batch systems, the specified time will be automatically translated into the equivalent directive for wall clock limit.

Tasks are polled multiple times, where necessary, when they exceed their execution time limits. (See [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> execution time limit polling intervals for how to configure the polling intervals).

  • type: ISO 8601 duration/interval representation
  • example: PT5M, 5 minutes, PT1H, 1 hour
  • default: (none)
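
For example, a sketch that imposes a 30 minute wall clock limit, translated to the equivalent PBS directive (or enforced via the timeout command for background jobs):

[runtime]
    [[model]]
        [[[job]]]
            batch system = pbs
            execution time limit = PT30M
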
15.1.5.1.12.3. [runtime] -> [[__NAME__]] -> [[[job]]] -> batch submit command template

This allows you to override the actual command used by the chosen batch system. The template's %(job)s will be substituted by the job file path.

  • type: string
  • legal values: a string template
  • example: llsubmit %(job)s
15.1.5.1.12.4. [runtime] -> [[__NAME__]] -> [[[job]]] -> shell

Location of the command used to interpret the job script submitted by the suite server program when a task is ready to run. This can be set to the location of bash on the job host if the shell is not installed in the standard location.

Note

It has no bearing on any sub-shells that may be called by the job script.

Setting this to the path of a ksh93 interpreter is deprecated; support for ksh93 will be withdrawn in a future cylc release. Setting this to any other shell is not supported.

  • type: string
  • root default: /bin/bash
15.1.5.1.12.5. [runtime] -> [[__NAME__]] -> [[[job]]] -> submission retry delays

A list of durations (in ISO 8601 syntax) after which to resubmit the job if job submission fails.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • example: PT1M,3*PT1H, P1D is equivalent to PT1M, PT1H, PT1H, PT1H, P1D - 1 minute, 1 hour, 1 hour, 1 hour, 1 day.
  • default: (none)
15.1.5.1.12.6. [runtime] -> [[__NAME__]] -> [[[job]]] -> execution retry delays

See also Automatic Task Retry On Failure.

A list of ISO 8601 time duration/intervals after which to resubmit the task if it fails. The variable $CYLC_TASK_TRY_NUMBER in the task execution environment is incremented each time, starting from 1 for the first try - this can be used to vary task behaviour by try number.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • example: PT1.5M,3*PT10M is equivalent to PT1.5M, PT10M, PT10M, PT10M - 1.5 minutes, 10 minutes, 10 minutes, 10 minutes.
  • default: (none)
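
As a sketch of varying behaviour by try number (the task name and the run-model command are hypothetical), a task might fall back to a safer mode after its first failure:

    [runtime]
        [[model]]
            script = """
                if [ "${CYLC_TASK_TRY_NUMBER}" -gt 1 ]; then
                    run-model --safe-mode  # retries use a more conservative configuration
                else
                    run-model
                fi
            """
            [[[job]]]
                execution retry delays = 2*PT10M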
15.1.5.1.12.7. [runtime] -> [[__NAME__]] -> [[[job]]] -> submission polling intervals

A list of intervals, expressed as ISO 8601 duration/intervals, with optional multipliers, after which cylc will poll for status while the task is in the submitted state.

For the polling task communication method this overrides the default submission polling interval in the site/user config files (Global (Site, User) Configuration Files). For default and ssh task communications, polling is not done by default but it can still be configured here as a regular check on the health of submitted tasks.

Each list value is used in turn until the last, which is used repeatedly until finished.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • example: PT1M,3*PT1H, PT1M is equivalent to PT1M, PT1H, PT1H, PT1H, PT1M - 1 minute, 1 hour, 1 hour, 1 hour, 1 minute.
  • default: (none)

A single interval value is probably appropriate for submission polling.

15.1.5.1.12.8. [runtime] -> [[__NAME__]] -> [[[job]]] -> execution polling intervals

A list of intervals, expressed as ISO 8601 duration/intervals, with optional multipliers, after which cylc will poll for status while the task is in the running state.

For the polling task communication method this overrides the default execution polling interval in the site/user config files (Global (Site, User) Configuration Files). For default and ssh task communications, polling is not done by default but it can still be configured here as a regular check on the health of running tasks.

Each list value is used in turn until the last, which is used repeatedly until finished.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • example: PT1M,3*PT1H, PT1M is equivalent to PT1M, PT1H, PT1H, PT1H, PT1M - 1 minute, 1 hour, 1 hour, 1 hour, 1 minute.
  • default: (none)
15.1.5.1.13. [runtime] -> [[__NAME__]] -> [[[remote]]]

Configure host and username, for tasks that do not run on the suite host account. Non-interactive ssh is used to submit the task to the configured batch system on the remote host, so you must distribute your ssh key to allow this. Cylc must be installed on task remote accounts, but no external software dependencies are required there.

15.1.5.1.13.1. [runtime] -> [[__NAME__]] -> [[[remote]]] -> host

The remote host for this namespace. This can be a static hostname, an environment variable that holds a hostname, or a command that prints a hostname to stdout. Host selection commands are executed just prior to job submission. The host (static or dynamic) may have an entry in the cylc site or user config file to specify parameters such as the location of cylc on the remote machine; if not, the corresponding local settings (on the suite host) will be assumed to apply on the remote host.

  • type: string (a valid hostname on the network)
  • default: (none)
  • examples:
    • static host name: host = foo
    • fully qualified: host = foo.bar.baz
    • dynamic host selection:
      • shell command (1): host = $(host-selector.sh)
      • shell command (2): host = `host-selector.sh`
      • environment variable: host = $MY_HOST
15.1.5.1.13.2. [runtime] -> [[__NAME__]] -> [[[remote]]] -> owner

The username of the task host account. This is (only) used in the non-interactive ssh command invoked by the suite server program to submit the remote task. Consequently it may be defined using local environment variables, i.e. those of the shell in which cylc runs, plus [cylc] -> [[environment]].

If you use dynamic host selection and have different usernames on the different selectable hosts, you can configure your $HOME/.ssh/config to handle username translation.

  • type: string (a valid username on the remote host)
  • default: (none)
15.1.5.1.13.3. [runtime] -> [[__NAME__]] -> [[[remote]]] -> retrieve job logs

Remote task job logs are saved to the suite run directory on the task host, not on the suite host. If you want the job logs pulled back to the suite host automatically, you can set this item to True. The suite will then attempt to rsync the job logs once from the remote host each time a task job completes. E.g. if the job file is ~/cylc-run/tut.oneoff.remote/log/job/1/hello/01/job, anything under ~/cylc-run/tut.oneoff.remote/log/job/1/hello/01/ will be retrieved.

  • type: boolean
  • default: False
15.1.5.1.13.4. [runtime] -> [[__NAME__]] -> [[[remote]]] -> retrieve job logs max size

If the disk space of the suite host is limited, you may want to set the maximum sizes of the job log files to retrieve. The value can be anything that is accepted by the --max-size=SIZE option of the rsync command.

  • type: string
  • default: None
15.1.5.1.13.5. [runtime] -> [[__NAME__]] -> [[[remote]]] -> retrieve job logs retry delays

Some batch systems have considerable delays between the job completing and the job logs being written to their normal location. If this is the case, you can configure an initial delay and some retry delays between subsequent attempts. The default behaviour is to attempt once without any delay.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • default: (none)
  • example: retrieve job logs retry delays = PT10S, PT1M, PT5M
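
Putting the remote items together, a sketch (host name hypothetical) of a task that runs remotely and has its job logs pulled back to the suite host:

    [runtime]
        [[fetch]]
            [[[remote]]]
                host = hpc1
                retrieve job logs = True
                retrieve job logs retry delays = PT10S, PT1M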
15.1.5.1.13.6. [runtime] -> [[__NAME__]] -> [[[remote]]] -> suite definition directory

The path to the suite configuration directory on the remote account, needed if remote tasks require access to files stored there (via $CYLC_SUITE_DEF_PATH) or in the suite bin directory (via $PATH). If this item is not defined, the local suite configuration directory path will be assumed, with the suite owner’s home directory, if present, replaced by '$HOME' for interpretation on the remote account.

  • type: string (a valid directory path on the remote account)
  • default: (local suite configuration path with $HOME replaced)
15.1.5.1.14. [runtime] -> [[__NAME__]] -> [[[events]]]

Cylc can call nominated event handlers when certain task events occur. This section configures specific task event handlers; see [cylc] -> [[events]] for suite events.

Event handlers can be located in the suite bin/ directory, otherwise it is up to you to ensure their location is in $PATH (in the shell in which the suite server program runs). They should require little resource to run and return quickly.

Each task event handler can be specified as a list of command lines or command line templates. They can contain any or all of the following patterns, which will be substituted with actual values:

  • %(event)s: event name
  • %(suite)s: suite name
  • %(suite_uuid)s: suite UUID string
  • %(point)s: cycle point
  • %(name)s: task name
  • %(submit_num)s: submit number
  • %(try_num)s: try number
  • %(id)s: task ID (i.e. %(name)s.%(point)s)
  • %(batch_sys_name)s: batch system name
  • %(batch_sys_job_id)s: batch system job ID
  • %(message)s: event message, if any
  • any task [meta] item, e.g.:
    • %(title)s: task title
    • %(URL)s: task URL
    • %(importance)s: example custom task metadata
  • any suite [meta] item, prefixed with "suite_", e.g.:
    • %(suite_title)s: suite title
    • %(suite_URL)s: suite URL
    • %(suite_rating)s: example custom suite metadata

Otherwise, the command line will be called with the following default arguments:

<task-event-handler> %(event)s %(suite)s %(id)s %(message)s

Note

Substitution patterns should not be quoted in the template strings. This is done automatically where required.

For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation.

Additional information can be passed to event handlers via the [cylc] -> [[environment]] (but not via task runtime environments - event handlers are not called by tasks).
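
For example, a handler command line template using several of the substitution patterns above (the notify-ops.sh script is hypothetical; it receives the substituted values as command line arguments):

    [runtime]
        [[root]]
            [[[events]]]
                failed handler = notify-ops.sh %(event)s %(suite)s %(id)s %(message)s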

15.1.5.1.14.1. [runtime] -> [[__NAME__]] -> [[[events]]] -> EVENT handler

A list of one or more event handlers to call when one of the following EVENTs occurs:

  • submitted - the job submit command was successful
  • submission failed - the job submit command failed, or the submitted job was killed before it started executing
  • submission retry - job submit failed, but cylc will resubmit it after a configured delay
  • submission timeout - the submitted job timed out without commencing execution
  • started - the task reported commencement of execution
  • succeeded - the task reported that it completed successfully
  • failed - the task reported that it failed to complete successfully
  • retry - the task failed, but cylc will resubmit it after a configured delay
  • execution timeout - the task timed out after execution commenced
  • warning - the task reported a WARNING severity message
  • critical - the task reported a CRITICAL severity message
  • custom - the task reported a CUSTOM severity message
  • late - the task is never active and is late

  • type: Comma-separated list of strings (event handler scripts).
  • default: (none)
  • example: failed handler = my-failed-handler.sh

15.1.5.1.14.2. [runtime] -> [[__NAME__]] -> [[[events]]] -> submission timeout

If a task has not started after the specified ISO 8601 duration/interval, the submission timeout event handler(s) will be called.

  • type: ISO 8601 duration/interval representation (e.g. PT30M, 30 minutes or P1D, 1 day).
  • default: (none)
15.1.5.1.14.3. [runtime] -> [[__NAME__]] -> [[[events]]] -> execution timeout

If a task has not finished after the specified ISO 8601 duration/interval, the execution timeout event handler(s) will be called.

  • type: ISO 8601 duration/interval representation (e.g. PT4H, 4 hours or P1D, 1 day).
  • default: (none)
15.1.5.1.14.4. [runtime] -> [[__NAME__]] -> [[[events]]] -> handlers

Specify a list of command lines or command line templates as task event handlers.

  • type: Comma-separated list of strings (event handler command line or command line templates).
  • default: (none)
  • example: handlers = my-handler.sh
15.1.5.1.14.5. [runtime] -> [[__NAME__]] -> [[[events]]] -> handler events

Specify the events for which the general task event handlers should be invoked.

  • type: Comma-separated list of events
  • default: (none)
  • example: handler events = submission failed, failed
15.1.5.1.14.6. [runtime] -> [[__NAME__]] -> [[[events]]] -> handler retry delays

Specify an initial delay before running an event handler command and any retry delays in case the command returns a non-zero code. The default behaviour is to run an event handler command once without any delay.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • default: (none)
  • example: handler retry delays = PT10S, PT1M, PT5M
15.1.5.1.14.7. [runtime] -> [[__NAME__]] -> [[[events]]] -> mail events

Specify the events for which notification emails should be sent.

  • type: Comma-separated list of events
  • default: (none)
  • example: mail events = submission failed, failed
15.1.5.1.14.8. [runtime] -> [[__NAME__]] -> [[[events]]] -> mail from

Specify an alternate from: email address for event notifications.

15.1.5.1.14.9. [runtime] -> [[__NAME__]] -> [[[events]]] -> mail retry delays

Specify an initial delay before running the mail notification command and any retry delays in case the command returns a non-zero code. The default behaviour is to run the mail notification command once without any delay.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • default: (none)
  • example: mail retry delays = PT10S, PT1M, PT5M
15.1.5.1.14.10. [runtime] -> [[__NAME__]] -> [[[events]]] -> mail smtp

Specify the SMTP server for sending email notifications.

  • type: string
  • default: None, (localhost:25)
  • example: mail smtp = smtp.yourorg
15.1.5.1.14.11. [runtime] -> [[__NAME__]] -> [[[events]]] -> mail to

A list of email addresses to which to send task event notifications. The list can be anything accepted by the mail command.

  • type: string
  • default: None, (USER@HOSTNAME)
  • example: mail to = your.colleague
15.1.5.1.15. [runtime] -> [[__NAME__]] -> [[[environment]]]

The user defined task execution environment. Variables defined here can refer to cylc suite and task identity variables, which are exported earlier in the task job script, and variable assignment expressions can use cylc utility commands because access to cylc is also configured earlier in the script. See also Task Execution Environment.

You can also specify job environment templates here for parameterized tasks (see Parameterized Tasks).

Note

Prior to Cylc 7.8.7 (or prior to 7.9.2 if using the 7.9.x releases), parameter environment templates were in a separate section. However, this was changed to allow users to control the order of definition of the variables.

15.1.5.1.15.1. [runtime] -> [[__NAME__]] -> [[[environment]]] -> __VARIABLE__

Replace __VARIABLE__ with any number of environment variable assignment expressions. Order of definition is preserved so values can refer to previously defined variables. Values are passed through to the task job script without evaluation or manipulation by Cylc (with the exception of valid Python string templates that match parameterized tasks - see below), so any variable assignment expression that is legal in the job submission shell can be used. White space around the = is allowed (as far as cylc’s suite.rc parser is concerned these are just normal configuration items).

  • type: string
  • default: (none)
  • legal values: depends to some extent on the task job submission shell ([runtime] -> [[__NAME__]] -> [[[job]]] -> shell).
  • examples, for the bash shell:
    • FOO = $HOME/bar/baz
    • BAR = ${FOO}$GLOBALVAR
    • BAZ = $( echo "hello world" )
    • WAZ = ${FOO%.jpg}.png
    • NEXT_CYCLE = $( cylc cycle-point --offset=PT6H )
    • PREV_CYCLE = `cylc cycle-point --offset=-PT6H`
    • ZAZ = "${FOO#bar}" # <-- QUOTED to escape the suite.rc comment character

To use parameter environment templates, replace __VARIABLE__ with pairs of environment variable name and Python string template for parameter substitution. This is only relevant for parameterized tasks - see Parameterized Tasks.

If specified, the job script will export the named variables specified here (in addition to the standard CYLC_TASK_PARAM_<key> variables), with the template strings substituted with the parameter values.

  • examples, for the bash shell:
    • MYNUM = %(i)d
    • MYITEM = %(item)s
    • MYFILE = /path/to/%(i)03d/%(item)s
15.1.5.1.16. [runtime] -> [[__NAME__]] -> [[[environment filter]]]

This section contains environment variable inclusion and exclusion lists that can be used to filter the inherited environment. This is not intended as an alternative to a well-designed inheritance hierarchy that provides each task with just the variables it needs. Filters can, however, improve suites with tasks that inherit a lot of environment they don’t need, by making it clear which tasks use which variables. They can optionally be used routinely as explicit “task environment interfaces” too, at some cost to brevity, because they guarantee that variables filtered out of the inherited task environment are not used.

Note

Environment filtering is done after inheritance is completely worked out, not at each level on the way, so filter lists in higher-level namespaces only have an effect if they are not overridden by descendants.

15.1.5.1.16.1. [runtime] -> [[__NAME__]] -> [[[environment filter]]] -> include

If given, only variables named in this list will be included from the inherited environment; others will be filtered out. Variables may also be explicitly excluded by an exclude list.

  • type: Comma-separated list of strings (variable names).
  • default: (none)
15.1.5.1.16.2. [runtime] -> [[__NAME__]] -> [[[environment filter]]] -> exclude

Variables named in this list will be filtered out of the inherited environment. Variables may also be implicitly excluded by omission from an include list.

  • type: Comma-separated list of strings (variable names).
  • default: (none)
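
A sketch of an include filter (task and variable names hypothetical); only the two named variables survive from the inherited environment:

    [runtime]
        [[post]]
            [[[environment filter]]]
                include = INPUT_DIR, OUTPUT_DIR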
15.1.5.1.17. [runtime] -> [[__NAME__]] -> [[[directives]]]

Batch queue scheduler directives. Whether or not these are used depends on the batch system. For the built-in methods that support directives (loadleveler, lsf, pbs, sge, slurm, moab), directives are written to the top of the task job script in the correct format for the method. Specifying directives individually like this allows use of default directives that can be individually overridden at lower levels of the runtime namespace hierarchy.

15.1.5.1.17.1. [runtime] -> [[__NAME__]] -> [[[directives]]] -> __DIRECTIVE__

Replace __DIRECTIVE__ with each directive assignment, e.g. class = parallel.

  • type: string
  • default: (none)

Example directives for the built-in batch system handlers are shown in Supported Job Submission Methods.

15.1.5.1.18. [runtime] -> [[__NAME__]] -> [[[outputs]]]

Register custom task outputs in this section, for use in message triggering (see Message Triggers).

15.1.5.1.18.1. [runtime] -> [[__NAME__]] -> [[[outputs]]] -> __OUTPUT__

Replace __OUTPUT__ with one or more custom task output messages (Message Triggers). The item name is used to select the custom output message in graph trigger notation.

  • type: string

  • default: (none)

  • examples:

    out1 = "sea state products ready"
    out2 = "NWP restart files completed"
    
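These output names select the messages in graph trigger notation; a sketch (task names hypothetical), in which the job must report each message, e.g. with cylc message:

    [scheduling]
        [[dependencies]]
            graph = """
                model:out1 => sea-products
                model:out2 => restart-archive
            """
    [runtime]
        [[model]]
            [[[outputs]]]
                out1 = "sea state products ready"
                out2 = "NWP restart files completed"
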
15.1.5.1.19. [runtime] -> [[__NAME__]] -> [[[suite state polling]]]

Configure automatic suite polling tasks as described in Triggering Off Of Tasks In Other Suites. The items in this section reflect the options and defaults of the cylc suite-state command, except that the target suite name and the --task, --cycle, and --status options are taken from the graph notation.

15.1.5.1.19.1. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> run-dir

For your own suites the run database location is determined by your site/user config. For other suites, e.g. those owned by others, or mirrored suite databases, use this item to specify the location of the top level cylc run directory (the database should be a suite-name sub-directory of this location).

  • type: string (a directory path on the target suite host)
  • default: as configured by site/user config (for your own suites)
15.1.5.1.19.2. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> interval

Polling interval expressed as an ISO 8601 duration/interval.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT1M
15.1.5.1.19.3. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> max-polls

The maximum number of polls before timing out and entering the “failed” state.

  • type: integer
  • default: 10
15.1.5.1.19.4. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> user

Username of an account on the suite host to which you have access. The polling cylc suite-state command will be invoked on the remote account.

  • type: string (username)
  • default: (none)
15.1.5.1.19.5. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> host

The hostname of the target suite. The polling cylc suite-state command will be invoked on the remote account.

  • type: string (hostname)
  • default: (none)
15.1.5.1.19.6. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> message

Wait for the target task in the target suite to receive a specified message rather than achieve a state.

  • type: string (the message)
  • default: (none)
15.1.5.1.19.7. [runtime] -> [[__NAME__]] -> [[[suite state polling]]] -> verbose

Run the polling cylc suite-state command in verbose output mode.

  • type: boolean
  • default: False
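
As a sketch, assuming poll-other is an automatic suite polling task defined via the graph notation described in Triggering Off Of Tasks In Other Suites (the name is hypothetical), the polling behaviour could be tuned like this:

    [runtime]
        [[poll-other]]
            [[[suite state polling]]]
                interval = PT30S
                max-polls = 120
                verbose = True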
15.1.5.1.20. [runtime] -> [[__NAME__]] -> [[[simulation]]]

Task configuration for the suite simulation and dummy run modes described in Simulating Suite Behaviour.

15.1.5.1.20.1. [runtime] -> [[__NAME__]] -> [[[simulation]]] -> default run length

The default simulated job run length, if [job]execution time limit and [simulation]speedup factor are not set.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT10S
15.1.5.1.20.2. [runtime] -> [[__NAME__]] -> [[[simulation]]] -> speedup factor

If [job]execution time limit is set, the task simulated run length is computed by dividing it by this factor.

  • type: float
  • default: (none) - i.e. do not use proportional run length
  • example: 10.0
15.1.5.1.20.3. [runtime] -> [[__NAME__]] -> [[[simulation]]] -> time limit buffer

For dummy jobs, a new [job]execution time limit is set to the simulated task run length plus this buffer interval, to avoid job kill due to exceeding the time limit.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT10S
15.1.5.1.20.4. [runtime] -> [[__NAME__]] -> [[[simulation]]] -> fail cycle points

Configure simulated or dummy jobs to fail at certain cycle points.

  • type: list of strings (cycle points), or all
  • default: (none) - no instances of the task will fail
  • examples:
    • all - all instances of the task will fail
    • 2017-08-12T06, 2017-08-12T18 - these instances of the task will fail
15.1.5.1.20.5. [runtime] -> [[__NAME__]] -> [[[simulation]]] -> fail try 1 only

If this is set to True only the first run of the task instance will fail, otherwise retries will fail too.

  • type: boolean
  • default: True
15.1.5.1.20.6. [runtime] -> [[__NAME__]] -> [[[simulation]]] -> disable task event handlers

If this is set to True configured task event handlers will not be called in simulation or dummy modes.

  • type: boolean
  • default: True
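
Combining the items above, a sketch (task name and cycle point hypothetical) for testing failure handling in simulation mode:

    [runtime]
        [[model]]
            [[[simulation]]]
                default run length = PT20S
                fail cycle points = 2017-08-12T06
                fail try 1 only = True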

15.1.6. [visualization]

Configuration of suite graphing for the cylc graph command (graph extent, styling, and initial family-collapsed state) and the gcylc graph view (initial family-collapsed state). See the Graphviz documentation of node shapes.

15.1.6.1. [visualization] -> initial cycle point

The initial cycle point for graph plotting.

  • type: ISO 8601 date-time representation (e.g. CCYYMMDDThhmm)
  • default: the suite initial cycle point

The visualization initial cycle point gets adjusted up if necessary to the suite initial cycle point.

15.1.6.2. [visualization] -> final cycle point

An explicit final cycle point for graph plotting. If used, this overrides the preferred number of cycle points (below).

  • type: ISO 8601 date-time representation (e.g. CCYYMMDDThhmm)
  • default: (none)

The visualization final cycle point gets adjusted down if necessary to the suite final cycle point.

15.1.6.3. [visualization] -> number of cycle points

The number of cycle points to graph starting from the visualization initial cycle point. This is the preferred way of defining the graph end point, but it can be overridden by an explicit final cycle point (above).

  • type: integer
  • default: 3
15.1.6.4. [visualization] -> collapsed families

A list of family (namespace) names to be shown in the collapsed state (i.e. the family members will be replaced by a single family node) when the suite is first plotted in the graph viewer or the gcylc graph view. If this item is not set, the default is to collapse all families at first. Interactive GUI controls can then be used to group and ungroup family nodes at will.

  • type: Comma-separated list of family names.
  • default: (none)
15.1.6.5. [visualization] -> use node color for edges

Plot graph edges (dependency arrows) with the same color as the upstream node, otherwise default to black.

  • type: boolean
  • default: False
15.1.6.6. [visualization] -> use node fillcolor for edges

Plot graph edges (i.e. dependency arrows) with the same fillcolor as the upstream node, if it is filled, otherwise default to black.

  • type: boolean
  • default: False
15.1.6.7. [visualization] -> node penwidth

Line width of node shape borders.

  • type: integer
  • default: 2
15.1.6.8. [visualization] -> edge penwidth

Line width of graph edges (dependency arrows).

  • type: integer
  • default: 2
15.1.6.9. [visualization] -> use node color for labels

Graph node labels can be printed in the same color as the node outline.

  • type: boolean
  • default: False
15.1.6.10. [visualization] -> default node attributes

Set the default attributes (color and style etc.) of graph nodes (tasks and families). Attribute pairs must be quoted to hide the internal = character.

  • type: Comma-separated list of quoted 'attribute=value' pairs.
  • legal values: see graphviz or pygraphviz documentation
  • default: 'style=filled', 'fillcolor=yellow', 'shape=box'
15.1.6.11. [visualization] -> default edge attributes

Set the default attributes (color and style etc.) of graph edges (dependency arrows). Attribute pairs must be quoted to hide the internal = character.

  • type: Comma-separated list of quoted 'attribute=value' pairs.
  • legal values: see graphviz or pygraphviz documentation
  • default: 'color=black'
15.1.6.12. [visualization] -> [[node groups]]

Define named groups of graph nodes (tasks and families) which can be styled en masse, by name, in [visualization] -> [[node attributes]]. Node groups are automatically defined for all task families, including root, so you can style family and member nodes at once by family name.

15.1.6.12.1. [visualization] -> [[node groups]] -> __GROUP__

Replace __GROUP__ with each named group of tasks or families.

  • type: Comma-separated list of task or family names.
  • default: (none)
  • example:
    • PreProc = foo, bar
    • PostProc = baz, waz
15.1.6.13. [visualization] -> [[node attributes]]

Here you can assign graph node attributes to specific nodes, or to all members of named groups defined in [visualization] -> [[node groups]]. Task families are automatically node groups. Styling of a family node applies to all member nodes (tasks and sub-families), but precedence is determined by ordering in the suite configuration. For example, if you style a family red and then one of its members green, cylc will plot a red family with one green member; but if you style one member green and then the family red, the red family styling will override the earlier green styling of the member.

15.1.6.13.1. [visualization] -> [[node attributes]] -> __NAME__

Replace __NAME__ with each node or node group for style attribute assignment.

  • type: Comma-separated list of quoted 'attribute=value' pairs.
  • legal values: see the Graphviz or PyGraphviz documentation
  • default: (none)
  • example (with reference to the node groups defined above):
    • PreProc = 'style=filled', 'fillcolor=orange'
    • PostProc = 'color=red'
    • foo = 'style=filled'

15.2. Global (Site, User) Config File Reference

This section defines all legal items and values for cylc site and user config files. See Global (Site, User) Configuration Files for file locations, intended usage, and how to generate the files using the cylc get-site-config command.

As for suite configurations, Jinja2 expressions can be embedded in site and user config files to generate the final result parsed by cylc. Use of Jinja2 in suite configurations is documented in Jinja2.

15.2.1. Top Level Items

15.2.1.1. temporary directory

A temporary directory is needed by a few cylc commands, and is cleaned automatically on exit. Leave unset for the default (usually $TMPDIR).

  • type: string (directory path)
  • default: (none)
  • example: temporary directory = /tmp/$USER/cylc
15.2.1.2. process pool size

Maximum number of concurrent processes used to execute external job submission, event handlers, and job poll and kill commands - see Managing External Command Execution.

  • type: integer
  • default: 4
15.2.1.3. process pool timeout

Interval after which long-running commands in the process pool will be killed - see Managing External Command Execution.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT10M - note this is set quite high to avoid killing important processes when the system is under load.
15.2.1.4. disable interactive command prompts

Commands that intervene in running suites can be made to ask for confirmation before acting. Some find this annoying and ineffective as a safety measure, however, so command prompts are disabled by default.

  • type: boolean
  • default: True
15.2.1.5. task host select command timeout

When a task host in a suite is a shell command string, cylc calls the shell to determine the task host. This call is invoked by the main process and may cause the suite to hang while waiting for the command to finish. This item sets a timeout for such commands to ensure that the suite can continue.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT10S

15.2.2. [task messaging]

This section contains configuration items that affect task-to-suite communications.

15.2.2.1. [task messaging] -> retry interval

If a send fails, the messaging code will retry after a configured delay interval.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT5S
15.2.2.2. [task messaging] -> maximum number of tries

If successive sends fail, the messaging code will give up after a configured number of tries.

  • type: integer
  • minimum: 1
  • default: 7
15.2.2.3. [task messaging] -> connection timeout

This is the same as the --comms-timeout option in cylc commands. Without a timeout remote connections to unresponsive suites can hang indefinitely (suites suspended with Ctrl-Z for instance).

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT30S
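
Assembled, the section might look like this in a site or user global config file (values illustrative only):

[task messaging]
    retry interval = PT10S
    maximum number of tries = 5
    connection timeout = PT1M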

15.2.3. [suite logging]

The suite event log, held under the suite run directory, is maintained as a rolling archive. Logs are rolled over (backed up and started anew) when they reach a configurable size limit.

15.2.3.1. [suite logging] -> rolling archive length

How many rolled logs to retain in the archive.

  • type: integer
  • minimum: 1
  • default: 5
15.2.3.2. [suite logging] -> maximum size in bytes

Suite event logs are rolled over when they reach this file size.

  • type: integer
  • default: 1000000

15.2.4. [documentation]

Documentation locations for the cylc doc command and gcylc Help menus.

15.2.4.1. [documentation] -> [[files]]

File locations of documentation held locally on the cylc host server.

15.2.4.1.1. [documentation] -> [[files]] -> html index

File location of the main cylc documentation index.

  • type: string
  • default: <cylc-dir>/doc/index.html
15.2.4.1.2. [documentation] -> [[files]] -> pdf user guide

File location of the cylc User Guide, PDF version.

  • type: string
  • default: <cylc-dir>/doc/cug-pdf.pdf
15.2.4.1.3. [documentation] -> [[files]] -> multi-page html user guide

File location of the cylc User Guide, multi-page HTML version.

  • type: string
  • default: <cylc-dir>/doc/html/multi/cug-html.html
15.2.4.1.4. [documentation] -> [[files]] -> single-page html user guide

File location of the cylc User Guide, single-page HTML version.

  • type: string
  • default: <cylc-dir>/doc/html/single/cug-html.html
15.2.4.2. [documentation] -> [[urls]]

Online documentation URLs.

15.2.4.2.1. [documentation] -> [[urls]] -> internet homepage

URL of the cylc internet homepage, with links to documentation for the latest official release.

15.2.4.2.2. [documentation] -> [[urls]] -> local index

Local intranet URL of the main cylc documentation index.

  • type: string
  • default: (none)

15.2.5. [document viewers]

PDF and HTML viewers can be launched by cylc to view the documentation.

15.2.5.1. [document viewers] -> pdf

Your preferred PDF viewer program.

  • type: string
  • default: evince
15.2.5.2. [document viewers] -> html

Your preferred web browser.

  • type: string
  • default: firefox

15.2.6. [editors]

Choose your favourite text editor for editing suite configurations.

15.2.6.1. [editors] -> terminal

The editor to be invoked by the cylc command line interface.

  • type: string
  • default: vim
  • examples:
    • terminal = emacs -nw (emacs non-GUI)
    • terminal = emacs (emacs GUI)
    • terminal = gvim -f (vim GUI)
15.2.6.2. [editors] -> gui

The editor to be invoked by the cylc GUI.

  • type: string
  • default: gvim -f
  • examples:
    • gui = emacs
    • gui = xterm -e vim

15.2.7. [communication]

This section covers options for network communication between cylc clients (suite-connecting commands and guis) and servers (running suites). Each suite listens on a dedicated network port, binding on the first available starting at the configured base port.

By default, the communication method is HTTPS secured with HTTP Digest Authentication. If the system does not support SSL, you should configure this section to use HTTP. Cylc will not automatically fall back to HTTP if HTTPS is not available.

15.2.7.1. [communication] -> method

The choice of client-server communication method - currently only HTTPS and HTTP are supported, although others could be developed and plugged in. Cylc defaults to HTTPS if this setting is not explicitly configured.

  • type: string
  • options:
    • https
    • http
  • default: https
15.2.7.2. [communication] -> base port

The first port that Cylc is allowed to use. This item (and maximum number of ports) is deprecated; please use run ports under [suite servers] instead.

  • type: integer
  • default: 43001
15.2.7.3. [communication] -> maximum number of ports

This setting (and base port) is deprecated; please use run ports under [suite servers] instead.

  • type: integer
  • default: 100
15.2.7.4. [communication] -> proxies on

Enable or disable proxy servers for HTTPS - disabled by default.

  • type: boolean
  • localhost default: False
15.2.7.5. [communication] -> options

Option flags for the communication method. Currently only ‘SHA1’ is supported for HTTPS, which alters HTTP Digest Auth to use the SHA1 hash algorithm rather than the standard MD5. This is more secure but is also less well supported by third party web clients including web browsers. You may need to add the ‘SHA1’ option if you are running on platforms where MD5 is discouraged (e.g. under FIPS).

  • type: string_list
  • default: []
  • options:
    • SHA1

15.2.8. [monitor]

Configurable settings for the command line cylc monitor tool.

15.2.8.1. [monitor] -> sort order

The sort order for tasks in the monitor view.

  • type: string
  • options:
    • alphanumeric
    • definition - the order that tasks appear under [runtime] in the suite configuration.
  • default: definition

15.2.9. [hosts]

The [hosts] section configures some important host-specific settings for the suite host (“localhost”) and remote task hosts.

Note

Remote task behaviour is determined by the site/user config on the suite host, not on the task host.

Suites can specify task hosts that are not listed here, in which case local settings will be assumed, with the local home directory path, if present, replaced by $HOME in items that configure directory locations.

15.2.9.1. [hosts] -> [[HOST]]

The default task host is the suite host, localhost, with default values as listed below. Use an explicit [hosts][[localhost]] section if you need to override the defaults. Localhost settings are then also used as defaults for other hosts, with the local home directory path replaced as described above. This applies to items omitted from an explicit host section, and to hosts that are not listed at all in the site and user config files. Explicit host sections are only needed if the automatically modified local defaults are not sufficient.

Host section headings can also be regular expressions to match multiple hostnames.

Note

The general regular expression wildcard is '.*' (zero or more of any character), not '*'. Hostname matching regular expressions are used as-is in the Python re.match() function.

As such they match from the beginning of the hostname string (as specified in the suite configuration) and they do not have to match through to the end of the string (use the string-end matching character '$' in the expression to force this).

A hierarchy of host match expressions from specific to general can be used because config items are processed in the order specified in the file.

  • type: string (hostname or regular expression)
  • examples:
    • server1.niwa.co.nz - explicit host name
    • server\d.niwa.co.nz - regular expression
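
For example, a specific-to-general hierarchy (host names hypothetical); the earlier, more specific section takes precedence for the host it matches:

[hosts]
    [[server1.niwa.co.nz]]
        task communication method = poll
    [[server\d.niwa.co.nz]]
        task communication method = ssh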
15.2.9.1.1. [hosts] -> [[HOST]] -> run directory

The top level for suite logs and service files, etc. Can contain $HOME or $USER but not other environment variables (the item cannot actually be evaluated by the shell on HOST before use, but the remote home directory is where rsync and ssh naturally land, and the remote username is known by the suite server program).

  • type: string (directory path)
  • default: $HOME/cylc-run
  • example: /nfs/data/$USER/cylc-run
15.2.9.1.2. [hosts] -> [[HOST]] -> work directory

The top level for suite work and share directories. Can contain $HOME or $USER but not other environment variables (the item cannot actually be evaluated by the shell on HOST before use, but the remote home directory is where rsync and ssh naturally land, and the remote username is known by the suite server program).

  • type: string (directory path)
  • localhost default: $HOME/cylc-run
  • example: /nfs/data/$USER/cylc-run
15.2.9.1.3. [hosts] -> [[HOST]] -> task communication method

The means by which task progress messages are reported back to the running suite. See below for the default polling intervals used by the poll method.

  • type: string (must be one of the following three options)
  • options:
    • default - direct client-server communication via network ports
    • ssh - use ssh to re-invoke the messaging commands on the suite server
    • poll - the suite polls for the status of tasks (no task messaging)
  • localhost default: default
15.2.9.1.4. [hosts] -> [[HOST]] -> execution polling intervals

Cylc can poll running jobs to catch problems that prevent task messages from being sent back to the suite, such as hard job kills, network outages, or unplanned task host shutdown. Routine polling is done only for the polling task communication method (below) unless suite-specific polling is configured in the suite configuration. A list of interval values can be specified, with the last value used repeatedly until the task is finished - this allows more frequent polling near the beginning and end of the anticipated task run time. Multipliers can be used as shorthand as in the example below.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default:
  • example: execution polling intervals = 5*PT1M, 10*PT5M, 5*PT1M
15.2.9.1.5. [hosts] -> [[HOST]] -> submission polling intervals

Cylc can also poll submitted jobs to catch problems that prevent the submitted job from executing at all, such as deletion from an external batch scheduler queue. Routine polling is done only for the polling task communication method (above) unless suite-specific polling is configured in the suite configuration. A list of interval values can be specified as for execution polling (above) but a single value is probably sufficient for job submission polling.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default:
  • example: (see the execution polling example above)
15.2.9.1.6. [hosts] -> [[HOST]] -> scp command

A string for the command used to copy files to a remote host. This is not used on the suite host unless you run local tasks under another user account. The value is assumed to be scp with some initial options or a command that implements a similar interface to scp.

  • type: string
  • localhost default: scp -oBatchMode=yes -oConnectTimeout=10
15.2.9.1.7. [hosts] -> [[HOST]] -> ssh command

A string for the command used to invoke commands on this host. This is not used on the suite host unless you run local tasks under another user account. The value is assumed to be ssh with some initial options or a command that implements a similar interface to ssh.

  • type: string
  • localhost default: ssh -oBatchMode=yes -oConnectTimeout=10
15.2.9.1.8. [hosts] -> [[HOST]] -> use login shell

Whether to use a login shell or not for remote command invocation. By default cylc runs remote ssh commands using a login shell:

ssh user@host 'bash --login cylc ...'

which will source /etc/profile and ~/.profile to set up the user environment. However, for security reasons some institutions do not allow unattended commands to start login shells, so you can turn off this behaviour to get:

ssh user@host 'cylc ...'

which will use the default shell on the remote machine, sourcing ~/.bashrc (or ~/.cshrc) to set up the environment.

  • type: boolean
  • localhost default: True
15.2.9.1.9. [hosts] -> [[HOST]] -> cylc executable

The cylc executable on a remote host.

Note

This should normally point to the cylc multi-version wrapper (see User Interfaces) on the host, not bin/cylc for a specific installed version.

Specify a full path if cylc is not in $PATH when it is invoked via ssh on this host.

  • type: string
  • localhost default: cylc
15.2.9.1.10. [hosts] -> [[HOST]] -> global init-script

If specified, the value of this setting will be inserted just before the init-script section of all job scripts that are to be submitted to the specified remote host.

  • type: string
  • localhost default: ""
15.2.9.1.11. [hosts] -> [[HOST]] -> copyable environment variables

A list containing the names of the environment variables that can and/or need to be copied from the suite server program to a job.

  • type: string_list
  • localhost default: []
15.2.9.1.12. [hosts] -> [[HOST]] -> retrieve job logs

Global default for the [runtime] -> [[__NAME__]] -> [[[remote]]] -> retrieve job logs setting for the specified host.

15.2.9.1.13. [hosts] -> [[HOST]] -> retrieve job logs command

If rsync -a is unavailable or insufficient to retrieve job logs from a remote host, you can use this setting to specify a suitable command.

  • type: string
  • default: rsync -a
15.2.9.1.14. [hosts] -> [[HOST]] -> retrieve job logs max size

Global default for the [runtime] -> [[__NAME__]] -> [[[remote]]] -> retrieve job logs max size setting for the specified host.

15.2.9.1.15. [hosts] -> [[HOST]] -> retrieve job logs retry delays

Global default for the [runtime] -> [[__NAME__]] -> [[[remote]]] -> retrieve job logs retry delays setting for the specified host.

15.2.9.1.16. [hosts] -> [[HOST]] -> task event handler retry delays

Host specific default for the [runtime] -> [[__NAME__]] -> [[[events]]] -> handler retry delays setting.

15.2.9.1.17. [hosts] -> [[HOST]] -> tail command template

A command template (with %(filename)s substitution) used by the GUI log viewer and cylc cat-log to tail-follow job logs on HOST. You are unlikely to need to override this.

  • type: string
  • default: tail -n +1 -F %(filename)s
15.2.9.1.18. [hosts] -> [[HOST]] -> [[[batch systems]]]

Settings for particular batch systems on HOST. In the subsections below, SYSTEM should be replaced with the cylc batch system handler name that represents the batch system (see [runtime] -> [[__NAME__]] -> [[[job]]] -> batch system).

15.2.9.1.18.1. [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> err tailer

A command template (with %(job_id)s substitution) that can be used to tail-follow the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides [hosts] -> [[HOST]] -> tail command template above.

  • type: string
  • default: (none)
  • example: For PBS:
[hosts]
    [[myhpc*]]
        [[[batch systems]]]
            [[[[pbs]]]]
                err tailer = qcat -f -e %(job_id)s
                out tailer = qcat -f -o %(job_id)s
                err viewer = qcat -e %(job_id)s
                out viewer = qcat -o %(job_id)s
15.2.9.1.18.2. [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> out tailer

A command template (with %(job_id)s substitution) that can be used to tail-follow the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides [hosts] -> [[HOST]] -> tail command template above.

15.2.9.1.18.3. [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> err viewer

A command template (with %(job_id)s substitution) that can be used to view the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running.

15.2.9.1.18.4. [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> out viewer

A command template (with %(job_id)s substitution) that can be used to view the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running.

15.2.9.1.18.5. [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> job name length maximum

The maximum job name length accepted by the batch system on a given host. Currently, this setting is only meaningful for PBS jobs. For example, PBS 12 or older will fail a job submission if the job name has more than 15 characters, which is the default limit. If you have PBS 13 or above, you may want to set a larger value.

  • type: integer
  • default: (none)
  • example: For PBS:
[hosts]
    [[myhpc*]]
        [[[batch systems]]]
            [[[[pbs]]]]
                # PBS 13
                job name length maximum = 236
15.2.9.1.18.6. [hosts] -> [[HOST]] -> [[[batch systems]]] -> [[[[SYSTEM]]]] -> execution time limit polling intervals

The intervals between polling after a task job (submitted to the relevant batch system on the relevant host) exceeds its execution time limit. The default setting is PT1M, PT2M, PT7M. The accumulated times (in minutes) for these intervals will be roughly 1, 1 + 2 = 3 and 1 + 2 + 7 = 10 after a task job exceeds its execution time limit.

  • type: Comma-separated list of ISO 8601 duration/interval representations, optionally preceded by multipliers.
  • default: PT1M, PT2M, PT7M
  • example:
[hosts]
    [[myhpc*]]
        [[[batch systems]]]
            [[[[pbs]]]]
                execution time limit polling intervals = 5*PT2M

15.2.10. [suite servers]

Configure allowed suite hosts and ports for starting up (running or restarting) suites and enabling them to be detected whilst running via utilities such as cylc gscan. Additionally configure host selection settings specifying how to determine the most suitable run host at any given time from those configured.

15.2.10.1. [suite servers] -> auto restart delay

Relates to Cylc’s auto stop-restart mechanism (see Auto Stop-Restart). When a host is set to automatically shutdown/restart it will first wait a random period of time between zero and auto restart delay seconds before beginning the process. This is to prevent large numbers of suites from restarting simultaneously.

  • type: integer
  • default: 0
15.2.10.2. [suite servers] -> condemned hosts

Hosts specified in condemned hosts will not be considered as suite run hosts. If suites are already running on condemned hosts they will be automatically shutdown and restarted (see Auto Stop-Restart).

  • type: comma-separated list of host names and/or IP addresses.
  • default: (none)
15.2.10.3. [suite servers] -> run hosts

A list of allowed suite run hosts. One of these hosts will be appointed for a suite to start up on if an explicit host is not provided as an option to a run or restart command.

  • type: comma-separated list of host names and/or IP addresses.
  • default: localhost
15.2.10.4. [suite servers] -> scan hosts

A list of hosts to scan for running suites.

  • type: comma-separated list of host names and/or IP addresses.
  • default: localhost
15.2.10.5. [suite servers] -> run ports

A list of allowed ports for Cylc to use to run suites.

Note

Only one suite can run per port for a given host, so the length of this list determines the maximum number of suites that can run at once per suite host.

This config item supersedes the deprecated settings base port and maximum number of ports, where the base port is equivalent to the first port, and the maximum number of ports to the length, of this list.

  • type: string in the format X .. Y for X <= Y where X and Y are integers.
  • default: 43001 .. 43100 (equivalent to the list 43001, 43002, ... , 43099, 43100)
15.2.10.6. [suite servers] -> scan ports

A list of ports to scan for running suites on each host set in scan hosts.

  • type: string in the format X .. Y for X <= Y where X and Y are integers.
  • default: 43001 .. 43100 (equivalent to the list 43001, 43002, ... , 43099, 43100)
15.2.10.7. [suite servers] -> [[run host select]]

Configure thresholds for excluding unsuitable hosts, and a method for ranking the remaining hosts, to be applied when selecting the most suitable run host from those configured. This selection occurs at start-up whenever a specific host is not set on the command line via the --host= option.

15.2.10.7.1. [suite servers] -> [[run host select]] -> rank

The method to use to rank the run host list in order of suitability.

  • type: string (which must be one of the options outlined below)
  • default: random
  • options:
    • random - shuffle the hosts to select a host at random
    • load:1 - rank and select for the lowest load average over 1 minute (as given by the uptime command)
    • load:5 - as for load:1 above, but over 5 minutes
    • load:15 - as for load:1 above, but over 15 minutes
    • memory - rank and select for the highest usable memory, i.e. free memory plus memory in the buffer cache ('buffers') and in the page cache ('cache'), as specified under /proc/meminfo
    • disk-space:PATH - rank and select for the highest free disk space for a given mount directory path PATH, as given by the df command, where multiple paths may be specified individually, i.e. via disk-space:PATH_1 and disk-space:PATH_2, etc.
15.2.10.7.2. [suite servers] -> [[run host select]] -> thresholds

A list of thresholds, i.e. cutoff values, which run hosts must meet in order to be considered as a possible run host. Each threshold is a minimum or a maximum requirement depending on the measure: usable memory (memory) and free disk space (disk-space:PATH) threshold values set a minimum, which must be exceeded, whereas load average (load:1, load:5 and load:15) threshold values set a maximum, which must not be exceeded. Failure to meet a threshold results in exclusion from the list of hosts that undergo ranking to determine the best host, which becomes the run host.

  • type: string in format MEASURE_1 CUTOFF_1; ... ;MEASURE_n CUTOFF_n (etc), where each MEASURE_N is one of the options below (note these correspond to all the rank methods accepted under the rank setting except for random which does not make sense as a threshold measure). Spaces delimit corresponding measures and their values, while semi-colons (optionally with subsequent spaces) delimit each measure-value pair.
  • options:
    • load:1 - load average over 1 minute (as given by the uptime command)
    • load:5 - as for load:1 above, but over 5 minutes
    • load:15 - as for load:1 above, but over 15 minutes
    • memory - usable memory i.e. free memory plus memory in the buffer cache (‘buffers’) and in the page cache (‘cache’), in KB, as specified under /proc/meminfo
    • disk-space:PATH - free disk space for a given mount directory path PATH, in KB, as given by the df command, where multiple paths may be specified individually i.e. via disk-space:PATH_1 and disk-space:PATH_2, etc.
  • default: (none)
  • examples:
    • thresholds = memory 2000 (set a minimum of 2000 KB in usable memory for possible run hosts)
    • thresholds = load:5 0.5; load:15 1.0; disk-space:/ 5000 (set a maximum of 0.5 and 1.0 for load averages over 5 and 15 minutes respectively and a minimum of 5000 KB of free disk-space on the / mount directory. If any of these thresholds are not met by a host, it will be excluded for running a suite on.)
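
Combined, a sketch of a [suite servers] section (host names hypothetical):

[suite servers]
    run hosts = hosta, hostb, hostc
    run ports = 43001 .. 43100
    [[run host select]]
        rank = load:5
        thresholds = load:5 0.5; memory 2000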

15.2.11. [suite host self-identification]

The suite host’s identity must be determined locally by cylc and passed to running tasks (via $CYLC_SUITE_HOST) so that task messages can target the right suite on the right host.

15.2.11.1. [suite host self-identification] -> method

This item determines how cylc finds the identity of the suite host. For the default name method cylc asks the suite host for its host name. This should resolve on remote task hosts to the IP address of the suite host; if it doesn’t, adjust network settings or use one of the other methods. For the address method, cylc attempts to use a special external “target address” to determine the IP address of the suite host as seen by remote task hosts (in-source documentation in <cylc-dir>/lib/cylc/hostuserutil.py explains how this works). And finally, as a last resort, you can choose the hardwired method and manually specify the host name or IP address of the suite host.

  • type: string
  • options:
    • name - self-identified host name
    • address - automatically determined IP address (requires target, below)
    • hardwired - manually specified host name or IP address (requires host, below)
  • default: name
15.2.11.2. [suite host self-identification] -> target

This item is required for the address self-identification method. If your suite host sees the internet, a common address such as google.com will do; otherwise choose a host visible on your intranet.

  • type: string (an inter- or intranet URL visible from the suite host)
  • default: google.com
15.2.11.3. [suite host self-identification] -> host

Use this item to explicitly set the name or IP address of the suite host if you have to use the hardwired self-identification method.

  • type: string (host name or IP address)
  • default: (none)
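
For example, if host name resolution is unreliable on task hosts, the hardwired method can be configured (address hypothetical):

[suite host self-identification]
    method = hardwired
    host = 192.168.1.10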

15.2.12. [task events]

Global site/user defaults for [runtime] -> [[__NAME__]] -> [[[events]]].

15.2.13. [test battery]

Settings for the automated development tests.

Note

The test battery reads <cylc-dir>/etc/global-tests.rc instead of the normal site/user global config files.

15.2.13.1. [test battery] -> remote host with shared fs

The name of a remote host that sees the same HOME file system as the host running the test battery.

15.2.13.2. [test battery] -> remote host

Host name of a remote account that does not see the same home directory as the account running the test battery (see also "remote owner" below).

15.2.13.3. [test battery] -> remote owner

User name of a remote account that does not see the same home directory as the account running the test battery (see also "remote host" above).

15.2.13.4. [test battery] -> [[batch systems]]

Settings for testing supported batch systems (job submission methods). The tests for a batch system are only performed if the batch system is available on the test host or a remote host accessible via SSH from the test host.

15.2.13.4.1. [test battery] -> [[batch systems]] -> [[[SYSTEM]]]

SYSTEM is the name of a supported batch system with automated tests. This can currently be “loadleveler”, “lsf”, “pbs”, “sge” and/or “slurm”.

15.2.13.4.1.1. [test battery] -> [[batch systems]] -> [[[SYSTEM]]] -> host

The name of a host where commands for this batch system are available. Use "localhost" if the batch system is available on the host running the test battery. Any specified remote host should be accessible via SSH from the host running the test battery.

15.2.13.4.1.2. [test battery] -> [[batch systems]] -> [[[SYSTEM]]] -> err viewer

The command template (with %(job_id)s substitution) for testing the run time stderr viewer functionality for this batch system.

15.2.13.4.1.3. [test battery] -> [[batch systems]] -> [[[SYSTEM]]] -> out viewer

The command template (with %(job_id)s substitution) for testing the run time stdout viewer functionality for this batch system.

15.2.13.4.1.4. [test battery] -> [[batch systems]] -> [[[SYSTEM]]] -> [[[[directives]]]]

The minimum set of directives that must be supplied to the batch system on the site to initiate jobs for the tests.
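
For example, a sketch enabling the automated Slurm tests (the host and directives are illustrative and site-dependent):

# etc/global-tests.rc (sketch)
[test battery]
    [[batch systems]]
        [[[slurm]]]
            host = localhost
            [[[[directives]]]]
                --partition = debug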

15.2.14. [cylc]

Default values for entries in the suite.rc [cylc] section.

15.2.14.1. [cylc] -> UTC mode

Allows you to set a default value for UTC mode in a suite at the site level. See [cylc] -> UTC mode for details.

15.2.14.2. [cylc] -> health check interval

Site default suite health check interval. See [cylc] -> health check interval for details.

15.2.14.3. [cylc] -> task event mail interval

Site default task event mail interval. See [cylc] -> task event mail interval for details.
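
For example, a sketch of site defaults for these three items:

# global.rc (sketch)
[cylc]
    UTC mode = True
    health check interval = PT10M
    task event mail interval = PT5M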

15.2.14.4. [cylc] -> [[events]]

You can define site defaults for each of the following options, details of which can be found under [cylc] -> [[events]]:

15.2.14.4.1. [cylc] -> [[events]] -> handlers
15.2.14.4.2. [cylc] -> [[events]] -> handler events
15.2.14.4.3. [cylc] -> [[events]] -> startup handler
15.2.14.4.4. [cylc] -> [[events]] -> shutdown handler
15.2.14.4.5. [cylc] -> [[events]] -> aborted handler
15.2.14.4.6. [cylc] -> [[events]] -> mail events
15.2.14.4.7. [cylc] -> [[events]] -> mail footer
15.2.14.4.8. [cylc] -> [[events]] -> mail from
15.2.14.4.9. [cylc] -> [[events]] -> mail smtp
15.2.14.4.10. [cylc] -> [[events]] -> mail to
15.2.14.4.11. [cylc] -> [[events]] -> timeout handler
15.2.14.4.12. [cylc] -> [[events]] -> timeout
15.2.14.4.13. [cylc] -> [[events]] -> abort on timeout
15.2.14.4.14. [cylc] -> [[events]] -> stalled handler
15.2.14.4.15. [cylc] -> [[events]] -> abort on stalled
15.2.14.4.16. [cylc] -> [[events]] -> inactivity handler
15.2.14.4.17. [cylc] -> [[events]] -> inactivity
15.2.14.4.18. [cylc] -> [[events]] -> abort on inactivity

15.2.15. [authentication]

Authentication of client programs with suite server programs can be configured here, and overridden in suites if necessary (see [cylc] -> [[authentication]]).

The suite-specific passphrase must be installed on a user’s account to authorize full control privileges (see Suite Passphrases and Client-Server Interaction). In the future we plan to move to a more traditional user account model so that each authorized user can have their own password.

15.2.15.1. [authentication] -> public

This sets the client privilege level for public access - i.e. no suite passphrase required.

  • type: string (must be one of the following options)
  • options:
    • identity - only suite and owner names revealed
    • description - identity plus suite title and description
    • state-totals - identity, description, and task state totals
    • full-read - full read-only access for monitor and GUI
    • shutdown - full read access plus shutdown, but no other control.
  • default: state-totals
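
For example, to restrict public access to suite identity only (a sketch):

# global.rc (sketch)
[authentication]
    public = identity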

15.3. Gcylc GUI (cylc gui) Config File Reference

This section defines all legal items and values for the gcylc user config file, which should be located in $HOME/.cylc/gcylc.rc. Current settings can be printed with the cylc get-gui-config command.

15.3.1. Top Level Items

15.3.1.1. dot icon size

Set the size of the task state dot icons displayed in the text and dot views.

  • type: string
  • legal values: small (10px), medium (14px), large (20px), extra large (30px)
  • default: medium
15.3.1.2. initial side-by-side views

Set the initial orientation of the suite view panels when the GUI starts. This can be changed later using the "View" menu "Toggle views side-by-side" option.

  • type: boolean (False or True)
  • default: False
15.3.1.3. initial views

Set the suite view panel(s) displayed initially, when the GUI starts. This can be changed later using the tool bar.

  • type: string (a list of one or two view names)
  • legal values: text, dot, graph
  • default: text
  • example: initial views = graph, dot
15.3.1.4. maximum update interval

Set the maximum (longest) time interval between calls to the suite for data update.

The update frequency of the GUI is variable. It is determined by considering the time of last update and the mean duration of the last 10 main loops of the suite.

In general, the GUI will use an update frequency that matches the mean duration of the suite’s main loop. In quiet time (or if the suite is not contactable), it will gradually increase the update interval (i.e. reduce the update frequency) to a maximum determined by this setting.

Increasing this setting will reduce the network traffic and hits on the suite process. However, if a quiet suite starts to pick up activity, the GUI may initially appear out of sync with what is happening in the suite for the duration of this interval.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT15S
15.3.1.5. sort by definition order

If this is not turned off, the default sort order for task names and families in the dot and text views will be the order in which they appear in the suite definition. Clicking on the task name column in the treeview will toggle to alphanumeric sort, and a View menu item does the same for the dot view. If turned off, the default sort order is alphanumeric and definition order is not available at all.

  • type: boolean
  • default: True
15.3.1.6. sort column

If text is in initial views then sort column sets the column that will be sorted initially when the GUI launches. Sorting can be changed later by clicking on the column headers.

  • type: string
  • legal values: task, state, host, job system, job ID, T-submit, T-start, T-finish, dT-mean, latest message, none
  • default: none
  • example: sort column = T-start
15.3.1.7. sort column ascending

For use in combination with sort column, sets whether the column will be sorted using ascending or descending order.

  • type: boolean
  • default: True
  • example: sort column ascending = False
15.3.1.8. sub-graphs on

Set the sub-graphs view to be enabled by default. This can be changed later using the toggle options for the graph view.

  • type: boolean (False or True)
  • default: False
15.3.1.9. task filter highlight color

The color used to highlight active task filters in gcylc. It must be a name from the X11 rgb.txt file, e.g. SteelBlue; or a quoted hexadecimal color code, e.g. "#ff0000" for red (quotes are required to prevent the hex code being interpreted as a comment).

  • type: string
  • default: PowderBlue
15.3.1.10. task states to filter out

Set the initial filtering options when the GUI starts. Later this can be changed by using the “View” menu “Task Filtering” option.

  • type: string list
  • legal values: waiting, held, queued, ready, expired, submitted, submit-failed, submit-retrying, running, succeeded, failed, retrying, runahead
  • default: runahead
15.3.1.11. transpose dot

Transposes the content in dot view so that it displays from left to right rather than from top to bottom. Can be changed later using the options submenu available via the view menu.

  • type: boolean
  • default: False
  • example: transpose dot = True
15.3.1.12. transpose graph

Transposes the content in graph view so that it displays from left to right rather than from top to bottom. Can be changed later using the options submenu via the view menu.

  • type: boolean
  • default: False
  • example: transpose graph = True
15.3.1.13. ungrouped views

List suite views, if any, that should be displayed initially in an ungrouped state. Namespace family grouping can be changed later using the tool bar.

  • type: string (a list of zero or more view names)
  • legal values: text, dot, graph
  • default: (none)
  • example: ungrouped views = text, dot
15.3.1.14. use theme

Set the task state color theme, common to all views, to use initially. The color theme can be changed later using the tool bar. See etc/gcylc.rc.eg and etc/gcylc-themes.rc in the Cylc installation directory for how to modify existing themes or define your own. Use cylc get-gui-config to list available themes.

  • type: string (theme name)
  • legal values: default, solid, high-contrast, color-blind, and any custom or user-modified themes.
  • default: default
15.3.1.15. window size

Sets the size (in pixels) of the cylc GUI at startup.

  • type: integer list: x, y
  • legal values: positive integers
  • default: 800, 500
  • example: window size = 1000, 700
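
Putting several of the above items together, a sketch of a complete gcylc config file (the values are illustrative):

# ~/.cylc/gcylc.rc (sketch)
dot icon size = large
initial views = graph, dot
maximum update interval = PT30S
sort column = T-start
sort column ascending = False
task states to filter out = waiting, runahead
window size = 1000, 700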

15.3.2. [themes]

This section may contain task state color theme definitions.

15.3.2.1. [themes] -> [[THEME]]

The name of the task state color-theme to be defined in this section.

  • type: string
15.3.2.1.1. [themes] -> [[THEME]] -> inherit

You can inherit from another theme in order to avoid defining all states.

  • type: string (parent theme name)
  • default: default
15.3.2.1.2. [themes] -> [[THEME]] -> defaults

Set default icon attributes for all state icons in this theme.

  • type: string list (icon attributes)
  • legal values: "color=COLOR", "style=STYLE", "fontcolor=FONTCOLOR"
  • default: (none)

For the attribute values, COLOR and FONTCOLOR can be color names from the X11 rgb.txt file, e.g. SteelBlue; or hexadecimal color codes, e.g. #ff0000 for red; and STYLE can be filled or unfilled. See etc/gcylc.rc.eg and etc/gcylc-themes.rc in the Cylc installation directory for examples.

15.3.2.1.3. [themes] -> [[THEME]] -> STATE

Set icon attributes for all task states in THEME, or for a subset of them if you have used theme inheritance and/or defaults. Legal values of STATE are any of the cylc task proxy states: waiting, runahead, held, queued, ready, submitted, submit-failed, running, succeeded, failed, retrying, submit-retrying.

  • type: string list (icon attributes)
  • legal values: "color=COLOR", "style=STYLE", "fontcolor=FONTCOLOR"
  • default: (none)

For the attribute values, COLOR and FONTCOLOR can be color names from the X11 rgb.txt file, e.g. SteelBlue; or hexadecimal color codes, e.g. #ff0000 for red; and STYLE can be filled or unfilled. See etc/gcylc.rc.eg and etc/gcylc-themes.rc in the Cylc installation directory for examples.
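
For example, a custom theme that inherits from the default theme and overrides two states (a sketch; the colors are illustrative):

# ~/.cylc/gcylc.rc (sketch)
[themes]
    [[my-theme]]
        inherit = default
        defaults = "style=filled", "fontcolor=black"
        failed = "color=#ff0000"
        running = "color=SteelBlue"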

15.4. Gscan GUI (cylc gscan) Config File Reference

This section defines all legal items and values for the gscan config file which should be located in $HOME/.cylc/gscan.rc. Some items also affect the gpanel panel app.

The main menubar can be hidden to maximise the display area. Its visibility can be toggled via the mouse right-click menu, or by typing Alt-m. When visible, the main View menu allows you to change properties such as the columns that are displayed, which hosts to scan for running suites, and the task state icon theme.

At startup, the task state icon theme and icon size are taken from the gcylc config file $HOME/.cylc/gcylc.rc.

15.4.1. Top Level Items

15.4.1.1. activate on startup

Set whether or not cylc gpanel will activate automatically when the GUI is loaded.

  • type: boolean (True or False)
  • legal values: True, False
  • default: False
  • example: activate on startup = True
15.4.1.2. columns

Set the columns to display when the cylc gscan GUI starts. This can be changed later with the View menu. The order in which the columns are specified here does not affect the display order.

  • type: string (a list of one or more view names)
  • legal values: host, owner, status, suite, title, updated
  • default: status, suite
  • example: columns = suite, title, status
15.4.1.3. suite listing update interval

Set the time interval between refreshing the suite listing (by file system or port range scan).

Increasing this setting will reduce how often gscan looks for running suites. Scanning for suites by port range can load the network and the running suite processes, while scanning by walking the file system can load the file system (especially a network file system). This interval is therefore normally set longer than the status update interval. Increasing it makes gscan friendlier to the network and/or the file system, but gscan may appear out of sync if many suites start up or shut down between scans.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT1M
15.4.1.4. suite status update interval

Set the time interval between calls to known running suites (suites that are known via the latest suite listing) for data updates.

Increasing this setting will reduce the network traffic and hits on the suite processes. However, gscan may appear out of sync with what may be happening in very busy suites.

  • type: ISO 8601 duration/interval representation (e.g. PT10S, 10 seconds, or PT1M, 1 minute).
  • default: PT15S
15.4.1.5. window size

Sets the size in pixels of the cylc gscan GUI window at startup.

  • type: integer list: x, y
  • legal values: positive integers
  • default: 300, 200
  • example: window size = 1000, 700
15.4.1.6. hide main menubar

Hide the main menubar of the cylc gscan GUI window at startup. By default, the menubar is not hidden. Either way, you can toggle its visibility with Alt-m or via the right-click menu.

  • type: boolean (True or False)
  • default: False
  • example: hide main menubar = True
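
Putting these items together, a sketch of a gscan config file (the values are illustrative):

# ~/.cylc/gscan.rc (sketch)
activate on startup = True
columns = suite, title, status
suite listing update interval = PT2M
suite status update interval = PT30S
window size = 1000, 700
hide main menubar = True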

15.5. Remote Job Management

Managing tasks in a workflow requires more than just job execution: Cylc performs additional actions with rsync for file transfer, and direct execution of cylc sub-commands over non-interactive SSH [4].

15.5.1. SSH-free Job Management?

Some sites may want to restrict access to job hosts by whitelisting SSH connections to allow only rsync for file transfer, and allowing job execution only via a local batch system that sees the job hosts [5]. We are investigating the feasibility of SSH-free job management when a local batch system is available, but this is not yet possible unless your suite and job hosts also share a filesystem, which allows Cylc to treat jobs as entirely local [6].

15.5.2. SSH-based Job Management

Cylc does not have persistent agent processes running on job hosts to act on instructions received over the network [7], so instead we execute job management commands directly on job hosts over SSH. Reasons for this include:

  • it works equally for batch system and background jobs
  • SSH is required for background jobs, and for batch jobs if the batch system is not available on the suite host
  • querying the batch system alone is not sufficient for full job polling functionality because jobs can complete (and then be forgotten by the batch system) while the network, suite host, or suite server program is down (e.g. between suite shutdown and restart)
    • to handle this we get the automatic job wrapper code to write job messages and exit status to job status files that are interrogated by suite server programs during job polling operations
    • job status files reside on the job host, so the interrogation is done over SSH
  • job status files also hold the batch system name and job ID; these are written by the job submit command, and read by the job poll and kill commands (all over SSH) - see the illustrative sketch below
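
For illustration, a job status file holds simple KEY=VALUE lines recording the batch system, job ID, and job lifecycle events, along these lines (a sketch; the exact key names and values are an assumption, not taken from this reference):

# ~/cylc-run/SUITE/log/job/1/TASK/01/job.status (illustrative sketch)
CYLC_BATCH_SYS_NAME=slurm
CYLC_BATCH_SYS_JOB_ID=123456
CYLC_JOB_INIT_TIME=2018-04-18T02:51:04Z
CYLC_JOB_EXIT=SUCCEEDED
CYLC_JOB_EXIT_TIME=2018-04-18T02:51:34Z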

15.5.3. A Concrete Example

The following suite, registered as suitex, is used to illustrate our current SSH-based remote job management. It submits two jobs to a remote, and a local task views a remote job log then polls and kills the remote jobs.

# suite.rc
[scheduling]
    [[dependencies]]
        graph = "delayer => master & REMOTES"
[runtime]
    [[REMOTES]]
        script = "sleep 30"
        [[[remote]]]
            host = wizard
            owner = hobo
    [[remote-a, remote-b]]
        inherit = REMOTES
    [[delayer]]
        script = "sleep 10"
    [[master]]
        script = """
sleep 5
cylc cat-log -m c -f o $CYLC_SUITE_NAME remote-a.1
sleep 2
cylc poll $CYLC_SUITE_NAME REMOTES.1
sleep 2
cylc kill $CYLC_SUITE_NAME REMOTES.1
sleep 2
cylc remove $CYLC_SUITE_NAME REMOTES.1"""

The delayer task just separates suite start-up from remote job submission, for clarity when watching the job host (e.g. with watch -n 1 find ~/cylc-run/suitex).

Global config specifies the path to the remote Cylc executable, says to retrieve job logs, and not to use a remote login shell:

# global.rc
[hosts]
   [[wizard]]
       cylc executable = /opt/bin/cylc
       retrieve job logs = True
       use login shell = False

On running the suite, remote job host actions were captured in the transcripts below by wrapping the ssh, scp, and rsync executables in scripts that log their command lines before taking action.

15.5.3.1. Create suite run directory and install source files

Done by rose suite-run before suite start-up (though the command will be migrated to Cylc soon).

  • with --new it invokes bash over SSH and a raw shell expression, to delete previous-run files
  • it invokes itself over SSH to create top level suite directories and install source files
    • skips installation if a server UUID file is found on the job host (which indicates a shared filesystem)
  • uses rsync for suite source file installation

Note

The same directory structure is used on suite and job hosts, for consistency and simplicity, and because the suite host can also be a job host.

# rose suite-run --new only: initial clean-out
ssh -oBatchMode=yes -oConnectTimeout=10 hobo@wizard bash -l -O extglob -c 'cd; echo '"'"'673d7a0d-7816-42a4-8132-4b1ab394349c'"'"'; ls -d -r cylc-run/suitex/work cylc-run/suitex/share/cycle cylc-run/suitex/share cylc-run/suitex; rm -fr cylc-run/suitex/work cylc-run/suitex/share/cycle cylc-run/suitex/share cylc-run/suitex; (cd ; rmdir -p cylc-run/suitex/work cylc-run/suitex/share/cycle cylc-run/suitex/share cylc-run 2>/dev/null || true)'

# rose suite-run: test for shared filesystem and create share/cycle directories
ssh -oBatchMode=yes -oConnectTimeout=10 -n hobo@wizard env ROSE_VERSION=2018.02.0 CYLC_VERSION=7.6.x bash -l -c '"$0" "$@"' rose suite-run -vv -n suitex --run=run --remote=uuid=231cd6a1-6d61-476d-96e1-4325ef9216fc,now-str=20180416T042319Z

# rose suite-run: install suite source directory to job host
rsync -a --exclude=.* --timeout=1800 --rsh=ssh -oBatchMode=yes -oConnectTimeout=10 --exclude=231cd6a1-6d61-476d-96e1-4325ef9216fc --exclude=log/231cd6a1-6d61-476d-96e1-4325ef9216fc --exclude=share/231cd6a1-6d61-476d-96e1-4325ef9216fc --exclude=share/cycle/231cd6a1-6d61-476d-96e1-4325ef9216fc --exclude=work/231cd6a1-6d61-476d-96e1-4325ef9216fc --exclude=/.* --exclude=/cylc-suite.db --exclude=/log --exclude=/log.* --exclude=/state --exclude=/share --exclude=/work ./ hobo@wizard:cylc-run/suitex
   # (internal rsync)
   ssh -oBatchMode=yes -oConnectTimeout=10 -l hobo wizard rsync --server -logDtpre.iLsfx --timeout=1800 . cylc-run/suitex
   # (internal rsync, back from hobo@wizard)
   rsync --server -logDtpre.iLsfx --timeout=1800 . cylc-run/suitex

Result:

 ~/cylc-run/suitex
|__log->log.20180418T025047Z  # LOG DIRECTORIES
|__log.20180418T025047Z  # log directory for current suite run
|__suite.rc
|__xxx  # any suite source sub-dirs or files
|__work  # JOB WORK DIRECTORIES
|__share  #  SUITE SHARE DIRECTORY
   |__cycle
15.5.3.2. Server installs service directory
  • contains the server address and credentials, so that clients such as cylc message executed by jobs can connect
  • done just before the first job is submitted to a remote, and at suite restart for the remotes of jobs running when the suite went down (server host, port, etc. may change at restart)
  • uses SSH to invoke cylc remote-init on job hosts. If the remote command does not find a server-side UUID file (which would indicate a shared filesystem) it reads a tar archive of the service directory from stdin, and unpacks it to install.
# cylc remote-init: install suite service directory
ssh -oBatchMode=yes -oConnectTimeout=10 hobo@wizard env CYLC_VERSION=7.6.x /opt/bin/cylc remote-init '066592b1-4525-48b5-b86e-da06eb2380d9' '$HOME/cylc-run/suitex'

Result:

 ~/cylc-run/suitex
|__.service  # SUITE SERVICE DIRECTORY
|  |__contact  # server address information
|  |__passphrase  # suite passphrase
|  |__ssl.cert  # suite SSL certificate
|__log->log.20180418T025047Z  # LOG DIRECTORIES
|__log.20180418T025047Z  # log directory for current suite run
|__suite.rc
|__xxx  # any suite source sub-dirs or files
|__work  # JOB WORK DIRECTORIES
|__share  #  SUITE SHARE DIRECTORY
   |__cycle
15.5.3.3. Server submits jobs
  • done when tasks are ready to run, for multiple jobs at once
  • uses SSH to invoke cylc jobs-submit on the remote - to read job scripts from stdin, write them to disk, and submit them to run
# cylc jobs-submit: submit two jobs
ssh -oBatchMode=yes -oConnectTimeout=10 hobo@wizard env CYLC_VERSION=7.6.x /opt/bin/cylc jobs-submit '--remote-mode' '--' '$HOME/cylc-run/suitex/log/job' '1/remote-a/01' '1/remote-b/01'

Result:

 ~/cylc-run/suitex
|__.service  # SUITE SERVICE DIRECTORY
|  |__contact  # server address information
|  |__passphrase  # suite passphrase
|  |__ssl.cert  # suite SSL certificate
|__log->log.20180418T025047Z  # LOG DIRECTORIES
|__log.20180418T025047Z  # log directory for current suite run
|  |__job  # job logs (to be distinguished from log/suite/ on the suite host)
|     |__1  # cycle point
|        |__remote-a  # task name
|        |  |__01  # job submit number
|        |  |  |__job  # job script
|        |  |  |__job.out  # job stdout
|        |  |  |__job.err  # job stderr
|        |  |  |__job.status  # job status
|        |  |__NN->01  # symlink to latest submit number
|        |__remote-b  # task name
|           |__01  # job submit number
|           |  |__job  # job script
|           |  |__job.out  # job stdout
|           |  |__job.err  # job stderr
|           |  |__job.status  # job status
|           |__NN->01  # symlink to latest submit number
|__suite.rc
|__xxx  # any suite source sub-dirs or files
|__work  # JOB WORK DIRECTORIES
|  |__1  # cycle point
|     |__remote-a  # task name
|     |  |__xxx  # (any files written by job to PWD)
|     |__remote-b  # task name
|        |__xxx  # (any files written by job to PWD)
|__share  #  SUITE SHARE DIRECTORY
   |__cycle
   |__xxx  # (any job-created sub-dirs and files)
15.5.3.4. Server tracks job progress
  • jobs send messages back to the server program on the suite host
    • directly: client-server HTTPS over the network (requires service files installed - see above)
    • indirectly: re-invoke clients on the suite host (requires reverse SSH)
  • OR server polls jobs at intervals (requires job polling - see below)
15.5.3.5. User views job logs
  • command cylc cat-log via CLI or GUI, invokes itself over SSH to the remote
  • suites will serve job logs in future, but this will still be needed (e.g. if the suite is down)
# cylc cat-log: view a job log
ssh -oBatchMode=yes -oConnectTimeout=10 -n hobo@wizard env CYLC_VERSION=7.6.x /opt/bin/cylc cat-log --remote-arg='$HOME/cylc-run/suitex/log/job/1/remote-a/NN/job.out' --remote-arg=cat --remote-arg='tail -n +1 -F %(filename)s' suitex
15.5.3.6. Server cancels or kills jobs
  • done automatically or via user command cylc kill, for multiple jobs at once
  • uses SSH to invoke cylc jobs-kill on the remote, with job log paths on the command line. Reads job ID from the job status file.
# cylc jobs-kill: kill two jobs
ssh -oBatchMode=yes -oConnectTimeout=10 hobo@wizard env CYLC_VERSION=7.6.x /opt/bin/cylc jobs-kill '--' '$HOME/cylc-run/suitex/log/job' '1/remote-a/01' '1/remote-b/01'
15.5.3.7. Server polls jobs
  • done automatically or via user command cylc poll, for multiple jobs at once
  • uses SSH to invoke cylc jobs-poll on the remote, with job log paths on the command line. Reads job ID from the job status file.
# cylc jobs-poll: poll two jobs
ssh -oBatchMode=yes -oConnectTimeout=10 hobo@wizard env CYLC_VERSION=7.6.x /opt/bin/cylc jobs-poll '--' '$HOME/cylc-run/suitex/log/job' '1/remote-a/01' '1/remote-b/01'
15.5.3.8. Server retrieves job logs
  • done at job completion, according to global config
  • uses rsync
# rsync: retrieve two job logs
rsync -a --rsh=ssh -oBatchMode=yes -oConnectTimeout=10 --include=/1 --include=/1/remote-a --include=/1/remote-a/01 --include=/1/remote-a/01/** --include=/1/remote-b --include=/1/remote-b/01 --include=/1/remote-b/01/** --exclude=/** hobo@wizard:$HOME/cylc-run/suitex/log/job/ /home/vagrant/cylc-run/suitex/log/job/
   # (internal rsync)
   ssh -oBatchMode=yes -oConnectTimeout=10 -l hobo wizard rsync --server --sender -logDtpre.iLsfx . $HOME/cylc-run/suitex/log/job/
   # (internal rsync, back from hobo@wizard)
   rsync --server --sender -logDtpre.iLsfx . /home/hobo/cylc-run/suitex/log/job/
15.5.3.9. Server tidies job remote at shutdown
  • removes .service/contact so that clients won’t repeatedly try to connect
# cylc remote-tidy: remove the remote suite contact file
ssh -oBatchMode=yes -oConnectTimeout=10 hobo@wizard env CYLC_VERSION=7.6.x /opt/bin/cylc remote-tidy '$HOME/cylc-run/suitex'

15.5.4. Other Use of SSH in Cylc

  • see if a suite is running on another host with a shared filesystem - see detect_old_contact_file() in lib/cylc/suite_srv_files_mgr.py
  • cat content of a remote service file over SSH, if possible, for clients that do not have suite credentials installed - see _load_remote_item() in suite_srv_files_mgr.py
[4] Cylc used to run bare shell expressions over SSH, which required a bash shell and made whitelisting difficult.
[5] A malicious script could be rsync’d and run from a batch job, but batch jobs are considered easier to audit.
[6] The job ID must also be valid to query and kill the job via the local batch system. This is not the case for Slurm unless the --cluster option is used explicitly in job query and kill commands; otherwise the job ID is not recognized by the local Slurm instance.
[7] This would be a more complex solution in terms of implementation, administration, and security.

15.6. Command Reference

15.6.1. Help

   Cylc ("silk") is a workflow engine for orchestrating complex
*suites* of inter-dependent distributed cycling (repeating) tasks, as well as
ordinary non-cycling workflows.
For detailed documentation see the Cylc User Guide (cylc doc --help).

Version 7.9.3

The graphical user interface for cylc is "gcylc" (a.k.a. "cylc gui").

USAGE:
  % cylc -V,--version,version           # print cylc version
  % cylc version --long                 # print cylc version and path
  % cylc help,--help,-h,?               # print this help page

  % cylc help CATEGORY                  # print help by category
  % cylc CATEGORY help                  # (ditto)
  % cylc help [CATEGORY] COMMAND        # print command help
  % cylc [CATEGORY] COMMAND --help      # (ditto)
  % cylc COMMAND --help                 # (ditto)

  % cylc COMMAND [options] SUITE [arguments]
  % cylc COMMAND [options] SUITE TASK [arguments]

Commands can be abbreviated as long as there is no ambiguity in
the abbreviated command:

  % cylc trigger SUITE TASK             # trigger TASK in SUITE
  % cylc trig SUITE TASK                # ditto
  % cylc tr SUITE TASK                  # ditto

  % cylc get                            # Error: ambiguous command

TASK IDENTIFICATION IN CYLC SUITES
  Tasks are identified by NAME.CYCLE_POINT, where CYCLE_POINT is either a
  date-time or an integer.
  Date-time cycle points are in an ISO 8601 date-time format, typically
  CCYYMMDDThhmm followed by a time zone - e.g. 20101225T0600Z.
  Integer cycle points (including those for one-off suites) are integers
  - just '1' for one-off suites.

HOW TO DRILL DOWN TO COMMAND USAGE HELP:
  % cylc help           # list all available categories (this page)
  % cylc help prep      # list commands in category 'preparation'
  % cylc help prep edit # command usage help for 'cylc [prep] edit'

Command CATEGORIES:
  control ....... Suite start up, monitoring, and control.
  information ... Interrogate suite definitions and running suites.
  all ........... The complete command set.
  task .......... The task messaging interface.
  license|GPL ... Software licensing information (GPL v3.0).
  admin ......... Cylc installation, testing, and example suites.
  preparation ... Suite editing, validation, visualization, etc.
  hook .......... Suite and task event hook scripts.
  discovery ..... Detect running suites.
  utility ....... Cycle arithmetic and templating, etc.

15.6.2. Command Categories

15.6.2.1. admin
   CATEGORY: admin - Cylc installation, testing, and example suites.

HELP: cylc [admin] COMMAND help,--help
  You can abbreviate admin and COMMAND.
  The category admin may be omitted.

COMMANDS:
  check-software .... Check required software is installed
  import-examples ... Import example suites into your suite run directory
  profile-battery ... Run a battery of profiling tests
  test-battery ...... Run a battery of self-diagnosing test suites
  upgrade-run-dir ... Upgrade a pre-cylc-6 suite run directory
15.6.2.2. all
   CATEGORY: all - The complete command set.

HELP: cylc [all] COMMAND help,--help
  You can abbreviate all and COMMAND.
  The category all may be omitted.

COMMANDS:
  5to6 ........................................ Improve the cylc 6 compatibility of a cylc 5 suite file
  broadcast|bcast ............................. Change suite [runtime] settings on the fly
  cat-log|log ................................. Print various suite and task log files
  cat-state ................................... Print the state of tasks from the state dump
  check-software .............................. Check required software is installed
  check-triggering ............................ A suite shutdown event hook for cylc testing
  check-versions .............................. Compare cylc versions on task host accounts
  checkpoint .................................. Tell suite to checkpoint its current state
  client ...................................... (Internal) Invoke HTTP(S) client, expect JSON input
  conditions .................................. Print the GNU General Public License v3.0
  cycle-point|cyclepoint|datetime|cycletime ... Cycle point arithmetic and filename templating
  diff|compare ................................ Compare two suite definitions and print differences
  documentation|browse ........................ Display cylc documentation (User Guide etc.)
  dump ........................................ Print the state of tasks in a running suite
  edit ........................................ Edit suite definitions, optionally inlined
  email-suite ................................. A suite event hook script that sends email alerts
  email-task .................................. A task event hook script that sends email alerts
  ext-trigger|external-trigger ................ Report an external trigger event to a suite
  function-run ................................ (Internal) Run a function in the process pool
  get-directory ............................... Retrieve suite source directory paths
  get-gui-config .............................. Print gcylc configuration items
  get-host-metrics ............................ Print localhost metric data
  get-site-config|get-global-config ........... Print site/user configuration items
  get-suite-config|get-config ................. Print suite configuration items
  get-suite-contact|get-contact ............... Print contact information of a suite server program
  get-suite-version|get-cylc-version .......... Print cylc version of a suite server program
  gpanel ...................................... Internal interface for GNOME 2 panel applet
  graph ....................................... Plot suite dependency graphs and runtime hierarchies
  graph-diff .................................. Compare two suite dependencies or runtime hierarchies
  gscan|gsummary .............................. Scan GUI for monitoring multiple suites
  gui ......................................... (a.k.a. gcylc) cylc GUI for suite control etc.
  hold ........................................ Hold (pause) suites or individual tasks
  import-examples ............................. Import example suites into your suite run directory
  insert ...................................... Insert tasks into a running suite
  jobs-kill ................................... (Internal) Kill task jobs
  jobs-poll ................................... (Internal) Retrieve status for task jobs
  jobs-submit ................................. (Internal) Submit task jobs
  jobscript ................................... Generate a task job script and print it to stdout
  kill ........................................ Kill submitted or running tasks
  list|ls ..................................... List suite tasks and family namespaces
  ls-checkpoints .............................. Display task pool etc at given events
  message|task-message ........................ Report task messages
  monitor ..................................... An in-terminal suite monitor (see also gcylc)
  nudge ....................................... Cause the cylc task processing loop to be invoked
  ping ........................................ Check that a suite is running
  poll ........................................ Poll submitted or running tasks
  print ....................................... Print registered suites
  profile-battery ............................. Run a battery of profiling tests
  ref-graph ................................... Print text-format "reference graphs" without GTK
  register .................................... Register a suite for use
  release|unhold .............................. Release (unpause) suites or individual tasks
  reload ...................................... Reload the suite definition at run time
  remote-init ................................. (Internal) Initialise a task remote
  remote-tidy ................................. (Internal) Tidy a task remote
  remove ...................................... Remove tasks from a running suite
  report-timings .............................. Generate a report on task timing data
  reset ....................................... Force one or more tasks to change state
  restart ..................................... Restart a suite from a previous state
  review ...................................... Start/stop ad-hoc Cylc Review web service server.
  run|start ................................... Start a suite at a given cycle point
  scan ........................................ Scan a host for running suites
  scp-transfer ................................ Scp-based file transfer for cylc suites
  search|grep ................................. Search in suite definitions
  set-verbosity ............................... Change a running suite's logging verbosity
  show ........................................ Print task state (prerequisites and outputs etc.)
  spawn ....................................... Force one or more tasks to spawn their successors
  stop|shutdown ............................... Shut down running suites
  submit|single ............................... Run a single task just as its parent suite would
  suite-state ................................. Query the task states in a suite
  test-battery ................................ Run a battery of self-diagnosing test suites
  trigger ..................................... Manually trigger or re-trigger a task
  upgrade-run-dir ............................. Upgrade a pre-cylc-6 suite run directory
  validate .................................... Parse and validate suite definitions
  view ........................................ View suite definitions, inlined and Jinja2 processed
  warranty .................................... Print the GPLv3 disclaimer of warranty
15.6.2.3. control
   CATEGORY: control - Suite start up, monitoring, and control.

HELP: cylc [control] COMMAND help,--help
  You can abbreviate control and COMMAND.
  The category control may be omitted.

COMMANDS:
  broadcast|bcast ................ Change suite [runtime] settings on the fly
  checkpoint ..................... Tell suite to checkpoint its current state
  client ......................... (Internal) Invoke HTTP(S) client, expect JSON input
  ext-trigger|external-trigger ... Report an external trigger event to a suite
  gui ............................ (a.k.a. gcylc) cylc GUI for suite control etc.
  hold ........................... Hold (pause) suites or individual tasks
  insert ......................... Insert tasks into a running suite
  kill ........................... Kill submitted or running tasks
  nudge .......................... Cause the cylc task processing loop to be invoked
  poll ........................... Poll submitted or running tasks
  release|unhold ................. Release (unpause) suites or individual tasks
  reload ......................... Reload the suite definition at run time
  remove ......................... Remove tasks from a running suite
  reset .......................... Force one or more tasks to change state
  restart ........................ Restart a suite from a previous state
  run|start ...................... Start a suite at a given cycle point
  set-verbosity .................. Change a running suite's logging verbosity
  spawn .......................... Force one or more tasks to spawn their successors
  stop|shutdown .................. Shut down running suites
  trigger ........................ Manually trigger or re-trigger a task
15.6.2.4. discovery
   CATEGORY: discovery - Detect running suites.

HELP: cylc [discovery] COMMAND help,--help
  You can abbreviate discovery and COMMAND.
  The category discovery may be omitted.

COMMANDS:
  check-versions ... Compare cylc versions on task host accounts
  ping ............. Check that a suite is running
  scan ............. Scan a host for running suites
15.6.2.5. hook
   CATEGORY: hook - Suite and task event hook scripts.

HELP: cylc [hook] COMMAND help,--help
  You can abbreviate hook and COMMAND.
  The category hook may be omitted.

COMMANDS:
  check-triggering ... A suite shutdown event hook for cylc testing
  email-suite ........ A suite event hook script that sends email alerts
  email-task ......... A task event hook script that sends email alerts
15.6.2.6. information
   CATEGORY: information - Interrogate suite definitions and running suites.

HELP: cylc [information] COMMAND help,--help
  You can abbreviate information and COMMAND.
  The category information may be omitted.

COMMANDS:
  cat-log|log .......................... Print various suite and task log files
  cat-state ............................ Print the state of tasks from the state dump
  documentation|browse ................. Display cylc documentation (User Guide etc.)
  dump ................................. Print the state of tasks in a running suite
  get-gui-config ....................... Print gcylc configuration items
  get-host-metrics ..................... Print localhost metric data
  get-site-config|get-global-config .... Print site/user configuration items
  get-suite-config|get-config .......... Print suite configuration items
  get-suite-contact|get-contact ........ Print contact information of a suite server program
  get-suite-version|get-cylc-version ... Print cylc version of a suite server program
  gpanel ............................... Internal interface for GNOME 2 panel applet
  gscan|gsummary ....................... Scan GUI for monitoring multiple suites
  gui|gcylc ............................ (a.k.a. gcylc) cylc GUI for suite control etc.
  list|ls .............................. List suite tasks and family namespaces
  monitor .............................. An in-terminal suite monitor (see also gcylc)
  review ............................... Start/stop ad-hoc Cylc Review web service server.
  show ................................. Print task state (prerequisites and outputs etc.)
15.6.2.7. license
   CATEGORY: license|GPL - Software licensing information (GPL v3.0).

HELP: cylc [license|GPL] COMMAND help,--help
  You can abbreviate license|GPL and COMMAND.
  The category license|GPL may be omitted.

COMMANDS:
  conditions ... Print the GNU General Public License v3.0
  warranty ..... Print the GPLv3 disclaimer of warranty
15.6.2.8. preparation
   CATEGORY: preparation - Suite editing, validation, visualization, etc.

HELP: cylc [preparation] COMMAND help,--help
  You can abbreviate preparation and COMMAND.
  The category preparation may be omitted.

COMMANDS:
  5to6 ............ Improve the cylc 6 compatibility of a cylc 5 suite file
  diff|compare .... Compare two suite definitions and print differences
  edit ............ Edit suite definitions, optionally inlined
  get-directory ... Retrieve suite source directory paths
  graph ........... Plot suite dependency graphs and runtime hierarchies
  graph-diff ...... Compare two suite dependencies or runtime hierarchies
  jobscript ....... Generate a task job script and print it to stdout
  list|ls ......... List suite tasks and family namespaces
  print ........... Print registered suites
  ref-graph ....... Print text-format "reference graphs" without GTK
  register ........ Register a suite for use
  search|grep ..... Search in suite definitions
  validate ........ Parse and validate suite definitions
  view ............ View suite definitions, inlined and Jinja2 processed
15.6.2.9. task
   CATEGORY: task - The task messaging interface.

HELP: cylc [task] COMMAND help,--help
  You can abbreviate task and COMMAND.
  The category task may be omitted.

COMMANDS:
  jobs-kill .............. (Internal) Kill task jobs
  jobs-poll .............. (Internal) Retrieve status for task jobs
  jobs-submit ............ (Internal) Submit task jobs
  message|task-message ... Report task messages
  remote-init ............ (Internal) Initialise a task remote
  remote-tidy ............ (Internal) Tidy a task remote
  submit|single .......... Run a single task just as its parent suite would
15.6.2.10. utility
   CATEGORY: utility - Cycle arithmetic and templating, etc.

HELP: cylc [utility] COMMAND help,--help
  You can abbreviate utility and COMMAND.
  The category utility may be omitted.

COMMANDS:
  cycle-point|cyclepoint|datetime|cycletime ... Cycle point arithmetic and filename templating
  function-run ................................ (Internal) Run a function in the process pool
  ls-checkpoints .............................. Display task pool etc at given events
  report-timings .............................. Generate a report on task timing data
  scp-transfer ................................ Scp-based file transfer for cylc suites
  suite-state ................................. Query the task states in a suite

15.6.3. Commands

15.6.3.1. 5to6
   Usage: cylc [prep] 5to6 FILE

Suggest changes to a cylc 5 suite file to make it more cylc 6 compatible.
This may be a suite.rc file, an include file, or a suite.rc.processed file.

By default, print the changed file to stdout. Lines that have been changed
are marked with '# UPGRADE'. These marker comments are purely for your own
information and should not be included in any changes you make. In
particular, they may break continuation lines.

Lines with '# UPGRADE CHANGE' have been altered.
Lines with '# UPGRADE ... INFO' indicate that manual change is needed.

As of cylc 7, 'cylc validate' will no longer print out automatic dependency
section translations. At cylc 6 versions of cylc, 'cylc validate' will show
start-up/mixed async replacement R1* section(s). The validity of these can
be highly dependent on the initial cycle point choice (e.g. whether it is
T00 or T12).

This command works best for hour-based cycling - it will always convert
e.g. 'foo[T-6]' to 'foo[-PT6H]', even where this is in a monthly or yearly
cycling section graph.

This command is an aid, and is not an auto-upgrader or a substitute for
reading the documentation. The suggested changes must be understood and
checked by hand.

Example usage:

# Print out a file path (FILE) with suggested changes to stdout.
cylc 5to6 FILE

# Replace the file with the suggested changes. Note: redirecting output
# straight back to FILE would truncate it before it is read, so write to
# a temporary file first.
cylc 5to6 FILE > FILE.new && mv FILE.new FILE

# Save a copy of the changed file.
cylc 5to6 FILE > FILE.5to6

# Show the diff of the changed file vs the original file.
diff - <(cylc 5to6 FILE) <FILE

Options:
  -h, --help   Print this help message and exit.
15.6.3.2. broadcast
   Usage: cylc [control] broadcast|bcast [OPTIONS] REG

Override [runtime] config in targeted namespaces in a running suite.

Uses for broadcast include making temporary changes to task behaviour,
and task-to-downstream-task communication via environment variables.

A broadcast can target any [runtime] namespace for all cycles or for a
specific cycle.  If a task is affected by specific-cycle and all-cycle
broadcasts at once, the specific takes precedence. If a task is affected
by broadcasts to multiple ancestor namespaces, the result is determined
by normal [runtime] inheritance. In other words, it follows this order:

all:root -> all:FAM -> all:task -> tag:root -> tag:FAM -> tag:task

Broadcasts persist, even across suite restarts, until they expire when
their target cycle point is older than the oldest current in the suite,
or until they are explicitly cancelled with this command.  All-cycle
broadcasts do not expire.

For each task the final effect of all broadcasts to all namespaces is
computed on the fly just prior to job submission.  The --cancel and
--clear options simply cancel (remove) active broadcasts, they do not
act directly on the final task-level result. Consequently, for example,
you cannot broadcast to "all cycles except Tn" with an all-cycle
broadcast followed by a cancel to Tn (there is no direct broadcast to Tn
to cancel); and you cannot broadcast to "all members of FAMILY except
member_n" with a general broadcast to FAMILY followed by a cancel to
member_n (there is no direct broadcast to member_n to cancel).

To broadcast a variable to all tasks (quote items with internal spaces):
  % cylc broadcast -s "[environment]VERSE = the quick brown fox" REG
To do the same with a file:
  % cat >'broadcast.rc' <<'__RC__'
  % [environment]
  %     VERSE = the quick brown fox
  % __RC__
  % cylc broadcast -F 'broadcast.rc' REG
To cancel the same broadcast:
  % cylc broadcast --cancel "[environment]VERSE" REG
If -F FILE was used, the same file can be used to cancel the broadcast:
  % cylc broadcast -G 'broadcast.rc' REG

Use -d/--display to see active broadcasts. Multiple --cancel options or
multiple --set and --set-file options can be used on the same command line.
Multiple --set and --set-file options are cumulative.

The --set-file=FILE option can be used when broadcasting multiple values, or
when the value contains newline or other metacharacters. If FILE is "-", read
from standard input.

Broadcast cannot change [runtime] inheritance.

See also 'cylc reload' - reload a modified suite definition at run time.

Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  -p CYCLE_POINT, --point=CYCLE_POINT
                        Target cycle point. More than one can be added.
                        Defaults to '*' with --set and --cancel, and nothing
                        with --clear.
  -n NAME, --namespace=NAME
                        Target namespace. Defaults to 'root' with --set and
                        --cancel, and nothing with --clear.
  -s [SEC]ITEM=VALUE, --set=[SEC]ITEM=VALUE
                        A [runtime] config item and value to broadcast.
  -F FILE, --set-file=FILE, --file=FILE
                        File with config to broadcast. Can be used multiple
                        times.
  -c [SEC]ITEM, --cancel=[SEC]ITEM
                        An item-specific broadcast to cancel.
  -G FILE, --cancel-file=FILE
                        File with broadcasts to cancel. Can be used multiple
                        times.
  -C, --clear           Cancel all broadcasts, or with -p/--point,
                        -n/--namespace, cancel all broadcasts to targeted
                        namespaces and/or cycle points. Use "-C -p '*'" to
                        cancel all all-cycle broadcasts without canceling all
                        specific-cycle broadcasts.
  -e CYCLE_POINT, --expire=CYCLE_POINT
                        Cancel any broadcasts that target cycle points earlier
                        than, but not inclusive of, CYCLE_POINT.
  -d, --display         Display active broadcasts.
  -k TASKID, --display-task=TASKID
                        Print active broadcasts for a given task
                        (NAME.CYCLE_POINT).
  -b, --box             Use unicode box characters with -d, -k.
  -r, --raw             With -d/--display or -k/--display-task, write out the
                        broadcast config structure in raw Python form.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.3. cat-log
   Usage: cylc [info] cat-log|log [OPTIONS] REG [TASK-ID]

Print, view-in-editor, or tail-follow content, print path, or list directory,
of local or remote task job and suite server logs. Batch-system view commands
(e.g. 'qcat') are used if defined in global config and the job is running.

For standard log types use the short-cut option argument or full filename (e.g.
for job stdout "-f o" or "-f job.out" will do).

To list the local job log directory of a remote task, choose "-m l" (directory
list mode) and a local file, e.g. "-f a" (job-activity.log).

If remote job logs are retrieved to the suite host on completion (global config
'[JOB-HOST]retrieve job logs = True') and the job is not currently running, the
local (retrieved) log will be accessed unless '-o/--force-remote' is used.

Custom job logs (written to $CYLC_TASK_LOG_DIR on the job host) are available
from the GUI if listed in 'extra log files' in the suite definition. The file
name must be given here, but can be discovered with '--mode=l' (list-dir).

The correct cycle point format of the suite must be used for task job logs.

Note the --host/user options are not needed to view remote job logs. They are
the general command reinvocation options for sites using ssh-based task
messaging.

Arguments:
   REG                     Suite name
   [TASK-ID]               Task ID

Options:
  -h, --help            show this help message and exit
  -f LOG, --file=LOG      Job log: a(job-activity.log), e(job.err), d(job-
                        edit.diff), j(job), o(job.out), s(job.status),
                        x(job.xtrace); default o(out).  Or <filename> for
                        custom (and standard) job logs.
  -m MODE, --mode=MODE  Mode: c(cat), e(edit), d(print-dir), l(list-dir),
                        p(print), t(tail). Default c(cat).
  -r INT, --rotation=INT
                        Suite log integer rotation number. 0 for current, 1
                        for next oldest, etc.
  -o, --force-remote    View remote logs remotely even if they have been
                        retrieved to the suite host (default False).
  -s INT, -t INT, --submit-number=INT, --try-number=INT
                        Job submit number (default=NN, i.e. latest).
  -g, --geditor         edit mode: use your configured GUI editor.
  --remote-arg=REMOTE_ARGS
                        (for internal use: continue processing on job host)
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
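
For example, a usage sketch (the suite and task names are illustrative):

# Print the job stdout of the latest submit of a task:
cylc cat-log -f o mysuite foo.20100101T0000Z

# Tail-follow the job stderr of a running task:
cylc cat-log -m t -f e mysuite foo.20100101T0000Z

# List the job log directory:
cylc cat-log -m l mysuite foo.20100101T0000Z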
15.6.3.4. cat-state
   Usage: cylc [info] cat-state [OPTIONS] REG

Print the suite state in the old state dump file format to stdout.
This command is deprecated; use "cylc ls-checkpoints" instead.

Arguments:
   REG               Suite name

Options:
  -h, --help     show this help message and exit
  -d, --dump     Use the same display format as the 'cylc dump' command.
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.5. check-software
   cylc [admin] check-software [MODULES]

Check for Cylc external software dependencies, including minimum versions.

With no arguments, prints a table of results for all core & optional external
module requirements, grouped by functionality. With module argument(s),
provides an exit status for the collective result of checks on those modules.

Arguments:
    [MODULES]   Modules to include in the software check, which returns a
                zero ('pass') or non-zero ('fail') exit status, where the
                integer is equivalent to the number of modules failing. Run
                the bare check-software command to view the full list of
                valid module arguments (lower-case equivalents accepted).
15.6.3.6. check-triggering
   cylc [hook] check-triggering ARGS

This is a cylc shutdown event handler that compares the newly generated
suite log with a previously generated reference log "reference.log"
stored in the suite definition directory. Currently it just compares
runtime triggering information, disregarding event order and timing, and
fails the suite if there is any difference. This should be sufficient to
verify correct scheduling of any suite that is not affected by different
run-to-run conditional triggering.

1) run your suite with "cylc run --generate-reference-log" to generate
the reference log with resolved triggering information. Check manually
that the reference run was correct.
2) run reference tests with "cylc run --reference-test" - this
automatically sets the shutdown event handler along with a suite timeout
and "abort if shutdown handler fails", "abort on timeout", and "abort if
any task fails".

Reference tests can use any run mode:
 * simulation mode - tests that scheduling is equivalent to the reference
 * dummy mode - also tests that task hosting, job submission, job script
   evaluation, and cylc messaging are not broken.
 * live mode - tests everything (but takes longer with real tasks!)

 If any task fails, or if cylc itself fails, or if triggering is not
 equivalent to the reference run, the test will abort with non-zero exit
 status - so reference tests can be used as automated tests to check
 that changes to cylc have not broken your suites.
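
For example, assuming a registered suite named 'my.suite' (name illustrative):
  % cylc run --generate-reference-log my.suite   # record a reference run
  % cylc run --reference-test my.suite           # test a new run against it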
15.6.3.7. check-versions
   Usage: cylc [discovery] check-versions [OPTIONS] SUITE

Check the version of cylc invoked on each of SUITE's task host accounts when
CYLC_VERSION is set to *the version running this command line tool*.
Different versions are reported but are not considered an error unless the
-e|--error option is specified, because different cylc versions from 6.0.0
onward should at least be backward compatible.

It is recommended that cylc versions be installed in parallel and access
configured via the cylc version wrapper as described in the cylc INSTALL
file and User Guide. This must be done on suite and task hosts. Users then get
the latest installed version by default, or (like tasks) a particular version
if $CYLC_VERSION is defined.

Use -v/--verbose to see the command invoked to determine the remote version
(all remote cylc command invocations are of the same form, which may be
site dependent; see the cylc global config documentation).
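
For example (suite name illustrative):
  % cylc check-versions my.suite           # report cylc versions on task hosts
  % cylc check-versions --error my.suite   # exit non-zero on any mismatch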

Arguments:
   SUITE               Suite name or path

Options:
  -h, --help            show this help message and exit
  -e, --error           Exit with error status if 7.9.3 is not available on
                        all remote accounts.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
15.6.3.8. checkpoint
   Usage: cylc [control] checkpoint [OPTIONS] REG CHECKPOINT-NAME

Tell suite to checkpoint its current state.
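
For example (suite and checkpoint names are illustrative):
  % cylc checkpoint my.suite pre-reconfig
The resulting checkpoint can be listed later with 'cylc ls-checkpoints'.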


Arguments:
   REG                           Suite name
   CHECKPOINT-NAME               Checkpoint name

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.9. client
   Usage: cylc client [OPTIONS] METHOD [REG]

(This command is for internal use.)
Invoke HTTP(S) client, expect JSON from STDIN for keyword arguments.
Use the -n option if client function requires no keyword arguments.


Arguments:
   METHOD               Network API function name
   [REG]                Suite name

Options:
  -h, --help            show this help message and exit
  -n, --no-input        Do not read from STDIN, assume null input
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.10. conditions
   Usage: cylc [license] conditions [--help]

Cylc is released under the GNU General Public License v3.0.
This command prints the GPL v3.0 license in full.

Options:
  --help   Print this usage message.
15.6.3.11. cycle-point
   Usage: cylc [util] cycle-point [OPTIONS] [POINT]

Cycle point date-time offset computation, and filename templating.

Filename templating replaces elements of a template string with corresponding
elements of the current or given cycle point.

Use ISO 8601 or POSIX date-time format elements:
  % cylc cycle-point 20100808T00 --template foo-CCYY-MM-DD-Thh.nc
  foo-2010-08-08-T00.nc
  % cylc cycle-point 20100808T00 --template foo-%Y-%m-%d-T%H.nc
  foo-2010-08-08-T00.nc

Other examples:

1) print offset from an explicit cycle point:
  % cylc [util] cycle-point --offset-hours=6 20100823T1800Z
  20100824T0000Z

2) print offset from $CYLC_TASK_CYCLE_POINT (as in suite tasks):
  % export CYLC_TASK_CYCLE_POINT=20100823T1800Z
  % cylc cycle-point --offset-hours=-6
  20100823T1200Z

3) cycle point filename templating, explicit template:
  % export CYLC_TASK_CYCLE_POINT=2010-08
  % cylc cycle-point --offset-years=2 --template=foo-CCYY-MM.nc
  foo-2012-08.nc

4) cycle point filename templating, template in a variable:
  % export CYLC_TASK_CYCLE_POINT=2010-08
  % export MYTEMPLATE=foo-CCYY-MM.nc
  % cylc cycle-point --offset-years=2 --template=MYTEMPLATE
  foo-2012-08.nc

Arguments:
   [POINT]               ISO8601 date-time, default=$CYLC_TASK_CYCLE_POINT

Options:
  -h, --help            show this help message and exit
  --offset-hours=HOURS  Add N hours to CYCLE (N may be negative)
  --offset-days=DAYS    Add N days to CYCLE (N may be negative)
  --offset-months=MONTHS
                        Add N months to CYCLE (N may be negative)
  --offset-years=YEARS  Add N years to CYCLE (N may be negative)
  --offset=ISO_OFFSET   Add an ISO 8601-based interval representation to CYCLE
  --equal=POINT2        Succeed if POINT2 is equal to POINT (format agnostic).
  --template=TEMPLATE   Filename template string or variable
  --time-zone=TIME_ZONE
                        Control the formatting of the result's time zone
                        (e.g. Z, +13:00, -hh).
  --num-expanded-year-digits=NUMBER
                        Specify a number of expanded year digits to print in
                        the result
  --print-year          Print only CCYY of result
  --print-month         Print only MM of result
  --print-day           Print only DD of result
  --print-hour          Print only hh of result
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
15.6.3.12. diff
   Usage: cylc [prep] diff|compare [OPTIONS] SUITE1 SUITE2

Compare two suite definitions and display any differences.

Differencing is done after parsing the suite.rc files so it takes
account of default values that are not explicitly defined, it disregards
the order of configuration items, and it sees any include-file content
after inlining has occurred.

Files in the suite bin directory and other sub-directories of the
suite definition directory are not currently differenced.
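
For example (suite names are illustrative):
  % cylc diff suite.one suite.two
  % cylc diff -n suite.one suite.two   # nested section headings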

Arguments:
   SUITE1               Suite name or path
   SUITE2               Suite name or path

Options:
  -h, --help            show this help message and exit
  -n, --nested          print suite.rc section headings in nested form.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
  --initial-cycle-point=CYCLE_POINT, --initial-point=CYCLE_POINT, --icp=CYCLE_POINT, --ict=CYCLE_POINT
                        Set the initial cycle point. Required if not defined
                        in suite.rc.
15.6.3.13. documentation
   Usage: cylc [info] documentation|browse [OPTIONS] [TARGET]

View documentation in the browser, as per Cylc global config.

% cylc doc [OPTIONS]
   View local or internet [--www] Cylc documentation URLs.

% cylc doc [-t TASK] SUITE
    View suite or task documentation, if URLs are specified in the suite. This
    parses the suite definition to extract the requested URL. Note that suite
    server programs also hold suite URLs for access from the Cylc GUI.

Arguments:
   [TARGET]               File or suite name

Options:
  -h, --help            show this help message and exit
  -g, --guides          Open the HTML (User & Suite Design) Guides directly.
  -w, --www             Open the cylc internet homepage
  -t TASK_NAME, --task=TASK_NAME
                        Browse task documentation URLs.
  -s, --stdout          Just print the URL to stdout.
  --debug               Print exception traceback on error.
  --url=URL             URL to view in your configured browser.
  -v, --verbose         Verbose output mode.
15.6.3.14. dump
   Usage: cylc [info] dump [OPTIONS] REG

Print state information (e.g. the state of each task) from a running
suite. For small suites 'watch cylc [info] dump SUITE' is an effective
non-GUI real time monitor (but see also 'cylc monitor').

For more information about a specific task, such as the current state of
its prerequisites and outputs, see 'cylc [info] show'.

Examples:
 Display the state of all running tasks, sorted by cycle point:
 % cylc [info] dump --tasks --sort SUITE | grep running

 Display the state of all tasks in a particular cycle point:
 % cylc [info] dump -t SUITE | grep 2010082406

Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  -g, --global          Global information only.
  -t, --tasks           Task states only.
  -r, --raw, --raw-format
                        Display raw format.
  -s, --sort            Task states only; sort by cycle point instead of name.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
15.6.3.15. edit
   Usage: cylc [prep] edit [OPTIONS] SUITE

Edit suite definitions without having to move to their directory
locations, and with optional reversible inlining of include-files. Note
that Jinja2 suites can only be edited in raw form but the processed
version can be viewed with 'cylc [prep] view -p'.

1/ cylc [prep] edit SUITE
Change to the suite definition directory and edit the suite.rc file.

2/ cylc [prep] edit -i,--inline SUITE
Edit the suite with include-files inlined between special markers. The
original suite.rc file is temporarily replaced so that the inlined
version is "live" during editing (i.e. you can run suites during
editing and cylc will pick up changes to the suite definition). The
inlined file is then split into its constituent include-files
again when you exit the editor. Include-files can be nested or
multiply-included; in the latter case only the first inclusion is
inlined (this prevents conflicting changes made to the same file).

3/ cylc [prep] edit --cleanup SUITE
Remove backup files left by previous INLINED edit sessions.

INLINED EDITING SAFETY: The suite.rc file and its include-files are
automatically backed up prior to an inlined editing session. If the
editor dies mid-session just invoke 'cylc edit -i' again to recover from
the last saved inlined file. On exiting the editor, if any of the
original include-files are found to have changed due to external
intervention during editing you will be warned and the affected files
will be written to new backups instead of overwriting the originals.
Finally, the inlined suite.rc file is also backed up on exiting
the editor, to allow recovery in case of accidental corruption of the
include-file boundary markers in the inlined file.

The edit process is spawned in the foreground as follows:
  % <editor> suite.rc
where <editor> is defined in the cylc site/user config files.

See also 'cylc [prep] view'.
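
For example (suite name illustrative):
  % cylc edit -i my.suite          # edit with include-files inlined
  % cylc edit --cleanup my.suite   # remove old inlined-edit backup files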

Arguments:
   SUITE               Suite name or path

Options:
  -h, --help           show this help message and exit
  -i, --inline         Edit with include-files inlined as described above.
  --cleanup            Remove backup files left by previous inlined edit
                       sessions.
  -g, --gui            Force use of the configured GUI editor.
  --user=USER          Other user account name. This results in command
                       reinvocation on the remote account.
  --host=HOST          Other host name. This results in command reinvocation
                       on the remote account.
  -v, --verbose        Verbose output mode.
  --debug              Output developer information and show exception
                       tracebacks.
  --suite-owner=OWNER  Specify suite owner
15.6.3.16. email-suite
   Usage: cylc [hook] email-suite EVENT SUITE MESSAGE

THIS COMMAND IS OBSOLETE - use built-in email event hooks.

This is a simple suite event hook script that sends an email.
The command line arguments are supplied automatically by cylc.

For example, to get an email alert when a suite shuts down:

# SUITE.RC
[cylc]
   [[environment]]
      MAIL_ADDRESS = foo@bar.baz.waz
   [[events]]
      shutdown handler = cylc email-suite

See the Suite.rc Reference (Cylc User Guide) for more information
on suite and task event hooks and event handler scripts.
15.6.3.17. email-task
   Usage: cylc [hook] email-task EVENT SUITE TASKID MESSAGE

THIS COMMAND IS OBSOLETE - use built-in email event hooks.

A simple task event hook handler script that sends an email.
The command line arguments are supplied automatically by cylc.

For example, to get an email alert whenever any task fails:

# SUITE.RC
[cylc]
   [[environment]]
      MAIL_ADDRESS = foo@bar.baz.waz
[runtime]
   [[root]]
      [[[events]]]
         failed handler = cylc email-task

See the Suite.rc Reference (Cylc User Guide) for more information
on suite and task event hooks and event handler scripts.
15.6.3.18. ext-trigger
   Usage: cylc [control] ext-trigger [OPTIONS] REG MSG ID

Report an external event message to a suite server program. It is expected that
a task in the suite has registered the same message as an external trigger - a
special prerequisite to be satisfied by an external system, via this command,
rather than by triggering off other tasks.

The ID argument should uniquely distinguish one external trigger event from the
next. When a task's external trigger is satisfied by an incoming message, the
message ID is broadcast to all downstream tasks in the cycle point as
$CYLC_EXT_TRIGGER_ID so that they can use it - e.g. to identify a new data file
that the external triggering system is responding to.

Use the retry options in case the target suite is down or out of contact.

The suite passphrase must be installed in $HOME/.cylc/<SUITE>/.

Note: to manually trigger a task use 'cylc trigger', not this command.
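
For example (names illustrative; the message must match one registered as an
external trigger by a task in the suite):
  % cylc ext-trigger my.suite "new obs data ready" obs-20150602T0030Z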

Arguments:
   REG               Suite name
   MSG               External trigger message
   ID                Unique trigger ID

Options:
  -h, --help            show this help message and exit
  --max-tries=INT       Maximum number of send attempts (default 5).
  --retry-interval=SEC  Delay in seconds before retrying (default 10.0).
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.19. function-run
   Usage: cylc function-run <name> <json-args> <json-kwargs> <src-dir>

INTERNAL USE (asynchronous external trigger function execution)

Run a Python function "<name>(*args, **kwargs)" in the process pool. It must be
defined in a module of the same name. Positional and keyword arguments must be
passed in as JSON strings. <src-dir> is the suite source dir, needed to find
local xtrigger modules.
15.6.3.20. get-directory
   Usage: cylc [prep] get-directory REG

Retrieve and print the source directory location of suite REG.
Here's an easy way to move to a suite source directory:
  $ cd $(cylc get-dir REG)

Arguments:
   REG                 Suite name or path

Options:
  -h, --help           show this help message and exit
  --user=USER          Other user account name. This results in command
                       reinvocation on the remote account.
  --host=HOST          Other host name. This results in command reinvocation
                       on the remote account.
  -v, --verbose        Verbose output mode.
  --debug              Output developer information and show exception
                       tracebacks.
  --suite-owner=OWNER  Specify suite owner
15.6.3.21. get-gui-config
   Usage: cylc [admin] get-gui-config [OPTIONS]

Print gcylc configuration settings.

By default all settings are printed. For specific sections or items
use -i/--item and wrap parent sections in square brackets:
   cylc get-gui-config --item '[themes][default]succeeded'
Multiple items can be specified at once.

Options:
  -h, --help            show this help message and exit
  -i [SEC...]ITEM, --item=[SEC...]ITEM
                        Item or section to print (multiple use allowed).
  --sparse              Only print items explicitly set in the config files.
  -p, --python          Print native Python format.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
15.6.3.22. get-host-metrics
   Usage: cylc get-host-metrics [OPTIONS]

Get metrics for localhost, in the form of a JSON structure with top-level
keys as requested via the OPTIONS:

1. --load
       1, 5 and 15 minute load averages (as keys) from the 'uptime' command.
2. --memory
       Total free RAM, in kilobytes, from the 'free -k' command.
3. --disk-space=PATH / --disk-space=PATH1,PATH2,PATH3 (etc)
       Available disk space from the 'df -Pk' command, in kilobytes, for one
       or more valid mount directory PATHs (as listed under 'Mounted on')
       within the filesystem of localhost. Multiple PATH options can be
       specified via a comma-delimited list, each becoming a key under the
       top-level disk space key.

If no options are specified, --load and --memory are invoked by default.
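
For example (mount points are illustrative):
  % cylc get-host-metrics --load --disk-space=/,/var/tmp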


Options:
  -h, --help         show this help message and exit
  -l, --load         1, 5 and 15 minute load averages from the 'uptime'
                     command.
  -m, --memory       Total memory not in use by the system, buffer or cache,
                     in KB, from '/proc/meminfo'.
  --disk-space=DISK  Available disk space, in KB, from the 'df -Pk' command.
15.6.3.23. get-site-config
   Usage: cylc [admin] get-site-config [OPTIONS]

Print cylc site/user configuration settings.

By default all settings are printed. For specific sections or items
use -i/--item and wrap parent sections in square brackets:
   cylc get-site-config --item '[editors]terminal'
Multiple items can be specified at once.

Options:
  -h, --help            show this help message and exit
  -i [SEC...]ITEM, --item=[SEC...]ITEM
                        Item or section to print (multiple use allowed).
  --sparse              Only print items explicitly set in the config files.
  -p, --python          Print native Python format.
  --print-run-dir       Print the configured cylc run directory.
  --print-site-dir      Print the cylc site configuration directory location.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
15.6.3.24. get-suite-config
   Usage: cylc [info] get-suite-config [OPTIONS] SUITE

Print parsed suite configuration items, after runtime inheritance.

By default all settings are printed. For specific sections or items
use -i/--item and wrap sections in square brackets, e.g.:
   cylc get-suite-config --item '[scheduling]initial cycle point'
Multiple items can be retrieved at once.

By default, unset values are printed as an empty string, or (for
historical reasons) as "None" with -o/--one-line. These defaults
can be changed with the -n/--null-value option.

Example:
  |# SUITE.RC
  |[runtime]
  |    [[modelX]]
  |        [[[environment]]]
  |            FOO = foo
  |            BAR = bar

$ cylc get-suite-config --item=[runtime][modelX][environment]FOO SUITE
foo

$ cylc get-suite-config --item=[runtime][modelX][environment] SUITE
FOO = foo
BAR = bar

$ cylc get-suite-config --item=[runtime][modelX] SUITE
...
[[[environment]]]
    FOO = foo
    BAR = bar
...

Arguments:
   SUITE               Suite name or path

Options:
  -h, --help            show this help message and exit
  -i [SEC...]ITEM, --item=[SEC...]ITEM
                        Item or section to print (multiple use allowed).
  -r, --sparse          Only print items explicitly set in the config files.
  -p, --python          Print native Python format.
  -a, --all-tasks       For [runtime] items (e.g. --item='script') report
                        values for all tasks prefixed by task name.
  -n STRING, --null-value=STRING
                        The string to print for unset values (default
                        nothing).
  -m, --mark-up         Prefix each line with '!cylc!'.
  -o, --one-line        Print multiple single-value items at once.
  -t, --tasks           Print the suite task list [DEPRECATED: use 'cylc list
                        SUITE'].
  -u RUN_MODE, --run-mode=RUN_MODE
                        Get config for suite run mode.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
  --initial-cycle-point=CYCLE_POINT, --initial-point=CYCLE_POINT, --icp=CYCLE_POINT, --ict=CYCLE_POINT
                        Set the initial cycle point. Required if not defined
                        in suite.rc.
15.6.3.25. get-suite-contact
   Usage: cylc [info] get-suite-contact [OPTIONS] REG

Print contact information of running suite REG.

Arguments:
   REG               Suite name

Options:
  -h, --help     show this help message and exit
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.26. get-suite-version
   Usage: cylc [info] get-suite-version [OPTIONS] REG

Interrogate running suite REG to find what version of cylc is running it.

To find the version you've invoked at the command line see "cylc version".

Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.27. gpanel
   Usage: cylc gpanel [OPTIONS]

This is a cylc scan panel applet for monitoring running suites on a set of
hosts in GNOME 2.

To install this applet, run "cylc gpanel --install" and follow the instructions
that it gives you.

This applet can be tested using the --test option.

To customize themes, copy $CYLC_DIR/etc/gcylc.rc.eg to $HOME/.cylc/gcylc.rc and
follow the instructions in the file.

To configure default suite hosts, edit "[suite servers]scan hosts" in your
global.rc file.

Options:
  -h, --help  show this help message and exit
  --compact   Switch on compact mode at runtime.
  --install   Install the panel applet.
  --test      Run in a standalone window.
15.6.3.28. graph
   Usage: 1/ cylc [prep] graph [OPTIONS] SUITE [START [STOP]]
     Plot the suite.rc dependency graph for SUITE.
       2/ cylc [prep] graph [OPTIONS] -f,--file FILE
     Plot the specified dot-language graph file.
        3/ cylc [prep] graph [OPTIONS] --reference SUITE [START [STOP]]
     Print out a reference format for the dependencies in SUITE.
       4/ cylc [prep] graph [OPTIONS] --output-file FILE SUITE
     Plot SUITE dependencies to a file FILE with an extension-derived format,
     e.g. if FILE ends with ".png", output in PNG format.

Plot suite dependency graphs in an interactive graph viewer, or (with
"--output-file") directly to image file.

See also "cylc ref-graph" to generate the plain text "reference" graph format
without the need for PyGTK to be installed.

If START is given it overrides "[visualization] initial cycle point" to
determine the start point of the graph, which defaults to the suite initial
cycle point. If STOP is given it overrides "[visualization] final cycle point"
to determine the end point of the graph, which defaults to the graph start
point plus "[visualization] number of cycle points" (which defaults to 3).
The graph start and end points are adjusted up and down to the suite initial
and final cycle points, respectively, if necessary.

The "Save" button generates an image of the current view, of format (e.g. png,
svg, jpg, eps) determined by the filename extension. If the chosen format is
not available a dialog box will show those that are available.

If the optional output filename is specified, the viewer will not open and a
graph will be written directly to the file.
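
For example (suite name and cycle points are illustrative):
  % cylc graph my.suite 20200101T00 20200103T00    # interactive viewer
  % cylc graph --output-file=graph.png my.suite    # write a PNG directly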

GRAPH VIEWER CONTROLS:
    * Center on a node: left-click.
    * Pan view: left-drag.
    * Zoom: +/- buttons, mouse-wheel, or ctrl-left-drag.
    * Box zoom: shift-left-drag.
    * "Best Fit" and "Normal Size" buttons.
    * Left-to-right graphing mode toggle button.
    * "Ignore suicide triggers" button.
    * "Save" button: save an image of the view.
  Family (namespace) grouping controls:
    Toolbar:
    * "group" - group all families up to root.
    * "ungroup" - recursively ungroup all families.
    Right-click menu:
    * "group" - close this node's parent family.
    * "ungroup" - open this family node.
    * "recursive ungroup" - ungroup all families below this node.

Arguments:
   [SUITE]               Suite name or path
   [START]               Initial cycle point (default: suite initial point)
   [STOP]                Final cycle point (default: initial + 3 points)

Options:
  -h, --help            show this help message and exit
  -u, --ungrouped       Start with task families ungrouped (the default is
                        grouped).
  -n, --namespaces      Plot the suite namespace inheritance hierarchy (task
                        run time properties).
  -f FILE, --file=FILE  View a specific dot-language graphfile.
  --filter=NODE_NAME_PATTERN
                        Filter out one or many nodes.
  -O FILE, --output-file=FILE
                        Output to a specific file, with a format given by
                        --output-format or extrapolated from the extension.
                        '-' implies stdout in plain format.
  --output-format=FORMAT
                        Specify a format for writing out the graph to
                        --output-file e.g. png, svg, jpg, eps, dot. 'ref' is a
                        special sorted plain text format for comparison and
                        reference purposes.
  -r, --reference       Output in a sorted plain text format for comparison
                        purposes. If not given, assume --output-file=-.
  --show-suicide        Show suicide triggers.  They are not shown by default,
                        unless toggled on with the tool bar button.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
15.6.3.29. graph-diff
   Usage: cylc graph-diff [OPTIONS] SUITE1 SUITE2 -- [GRAPH_OPTIONS_ARGS]

Difference 'cylc graph --reference' output for SUITE1 and SUITE2.

OPTIONS: Use '-g' to launch a graphical diff utility.
         Use '--diff-cmd=MY_DIFF_CMD' to use a custom diff tool.

SUITE1, SUITE2: Suite names to compare.
GRAPH_OPTIONS_ARGS: Options and arguments passed directly to cylc graph.
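
For example (suite names are illustrative):
  % cylc graph-diff suite.one suite.two      # plain text diff
  % cylc graph-diff -g suite.one suite.two   # graphical diff utility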
15.6.3.30. gscan
   Usage: cylc gscan [OPTIONS] [HOSTS ...]

This is the cylc scan gui for monitoring running suites on a set of
hosts.

To customize themes copy $CYLC_DIR/etc/gcylc.rc.eg to ~/.cylc/gcylc.rc and
follow the instructions in the file.

Arguments:
   [HOSTS ...]               Hosts to scan instead of the configured hosts.

Options:
  -h, --help            show this help message and exit
  -a, --all             Scan all port ranges in known hosts.
  -n PATTERN, --name=PATTERN
                        List suites with name matching PATTERN (regular
                        expression). Defaults to any name. Can be used
                        multiple times.
  -o PATTERN, --suite-owner=PATTERN
                        List suites with owner matching PATTERN (regular
                        expression). Defaults to just your own suites. Can be
                        used multiple times.
  --comms-timeout=SEC   Set a timeout for network connections to each running
                        suite. The default is 5 seconds.
  --interval=SECONDS    Time interval (in seconds) between full updates
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
15.6.3.31. gui
   Usage: cylc gui [OPTIONS] [REG] [USER_AT_HOST]
gcylc [OPTIONS] [REG] [USER_AT_HOST]

This is the cylc Graphical User Interface.

The USER_AT_HOST argument allows suite selection by 'cylc scan' output:
  cylc gui $(cylc scan | grep <suite_name>)

Local suites can be opened and switched between from within gcylc. To connect
to running remote suites (whose passphrase you have installed) you must
currently use --host and/or --user on the gcylc command line.

Available task state color themes are shown under the View menu. To customize
themes copy <cylc-dir>/etc/gcylc.rc.eg to ~/.cylc/gcylc.rc and follow the
instructions in the file.

To see current configuration settings use "cylc get-gui-config".

In the graph view, View -> Options -> "Write Graph Frames" writes .dot graph
files to the suite share directory (locally, for a remote suite). These can
be processed into a movie by $CYLC_DIR/dev/bin/live-graph-movie.sh.

Arguments:
   [REG]                        Suite name
   [USER_AT_HOST]               user@host:port, shorthand for --user, --host & --port.

Options:
  -h, --help            show this help message and exit
  -r, --restricted      Restrict display to 'active' task states: submitted,
                        submit-failed, submit-retrying, running, failed,
                        retrying; and disable the graph view.  This may be
                        needed for very large suites. The state summary icons
                        in the status bar still represent all task proxies.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
15.6.3.32. hold
   Usage: cylc [control] hold [OPTIONS] REG [TASK_GLOB ...]

Hold a suite or tasks:
  cylc hold REG - hold a suite
  cylc hold REG TASK_GLOB ... - hold one or more tasks in a suite

Held tasks do not submit their jobs even if ready to run.

See also 'cylc [control] release'.

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
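
For example (suite name illustrative):
  % cylc hold my.suite                          # hold the whole suite
  % cylc hold my.suite '20200202T0000Z/*'       # hold all tasks in one cycle
  % cylc hold --after=20200301T0000Z my.suite   # hold the suite after a point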

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  --after=CYCLE_POINT   Hold whole suite AFTER this cycle point.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
15.6.3.33. import-examples
   Usage: cylc import-examples DIR

Copy the cylc example suites to DIR and register them for use under the GROUP
suite name group.

Arguments:
   DIR    destination directory
15.6.3.34. insert
   Usage: cylc [control] insert [OPTIONS] REG TASK_GLOB [...]

Insert new task proxies into the task pool of a running suite, e.g. to allow
re-triggering of an earlier task that has already been removed from the pool.

NOTE: inserted cycling tasks cycle on as normal, even if another instance of
the same task exists at a later cycle (instances of the same task at different
cycles can coexist, but a newly spawned task will not be added to the pool if
it catches up to another task with the same ID).

See also 'cylc submit', for running tasks independently of the scheduler.

TASK_GLOB matches task or family names, to insert task instances into the pool
at a specific given cycle point. (NOTE this differs from other commands which
match name and cycle point patterns against instances already in the pool).
* CYCLE-POINT/TASK-NAME-GLOB
* CYCLE-POINT/FAMILY-NAME-GLOB
* TASK-NAME-GLOB.CYCLE-POINT
* FAMILY-NAME-GLOB.CYCLE-POINT

For example, to match, within the given cycle point (e.g. '20200202T0000Z'):
* all tasks: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks named model_N for some character N: '20200202T0000Z/model_?' or
  'model_?.20200202T0000Z'
* all tasks in 'BAR' family: '20200202T0000Z/BAR' or 'BAR.20200202T0000Z'
* all tasks in 'BAR' or 'BAZ' families: '20200202T0000Z/BA[RZ]' or
  'BA[RZ].20200202T0000Z'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
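
For example, to re-insert a task instance at a given cycle point (names
illustrative):
  % cylc insert my.suite '20200202T0000Z/model_a'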

Arguments:
   REG                           Suite name
   TASK_GLOB [...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  --stop-point=CYCLE_POINT, --remove-point=CYCLE_POINT
                        Optional hold/stop cycle point for inserted task.
  --no-check            Add task even if the provided cycle point is not valid
                        for the given task.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.35. jobs-kill
   Usage: cylc [control] jobs-kill JOB-LOG-ROOT [JOB-LOG-DIR ...]

(This command is for internal use. Users should use "cylc kill".) Read job
status files to obtain the names of the batch systems and the job IDs in the
systems. Invoke the relevant batch system commands to ask the batch systems to
terminate the jobs.



Arguments:
   JOB-LOG-ROOT                    The log/job sub-directory for the suite
   [JOB-LOG-DIR ...]               A point/name/submit_num sub-directory

Options:
  -h, --help     show this help message and exit
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.36. jobs-poll
   Usage: cylc [control] jobs-poll JOB-LOG-ROOT [JOB-LOG-DIR ...]

(This command is for internal use. Users should use "cylc poll".) Read job
status files to obtain the statuses of the jobs. If necessary, invoke the
relevant batch system commands to ask the batch systems for more statuses.



Arguments:
   JOB-LOG-ROOT                    The log/job sub-directory for the suite
   [JOB-LOG-DIR ...]               A point/name/submit_num sub-directory

Options:
  -h, --help     show this help message and exit
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.37. jobs-submit
   Usage: cylc [control] jobs-submit JOB-LOG-ROOT [JOB-LOG-DIR ...]

(This command is for internal use. Users should use "cylc submit".) Submit task
jobs to relevant batch systems. On a remote job host, this command reads the
job files from STDIN.



Arguments:
   JOB-LOG-ROOT                    The log/job sub-directory for the suite
   [JOB-LOG-DIR ...]               A point/name/submit_num sub-directory

Options:
  -h, --help     show this help message and exit
  --remote-mode  Run in remote mode, i.e. on a remote job host.
  --utc-mode     (Remote mode) indicate that the suite is running in UTC mode.
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.38. jobscript
   Usage: cylc [prep] jobscript [OPTIONS] REG TASK

Generate a task job script and print it to stdout.

Here's how to capture the script in the vim editor:
  % cylc jobscript REG TASK | vim -
Emacs unfortunately cannot read from stdin:
  % cylc jobscript REG TASK > tmp.sh; emacs tmp.sh

This command wraps 'cylc [control] submit --dry-run'.
Other options (e.g. for suite host and owner) are passed
through to the submit command.

Options:
  -h, --help   Print this usage message.
  -e, --edit   Open the jobscript in a CLI text editor.
  -g, --gedit  Open the jobscript in a GUI text editor.
  --plain      Don't print the "Task Job Script Generated" message.
 (see also 'cylc submit --help')

Arguments:
  REG          Registered suite name.
  TASK         Task ID (NAME.CYCLE_POINT)
15.6.3.39. kill
   Usage: cylc [control] kill [OPTIONS] REG [TASK_GLOB ...]

Kill jobs of active tasks and update their statuses accordingly.
 cylc kill REG TASK_GLOB ... - kill one or more active tasks
 cylc kill REG - kill all active tasks in the suite

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
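
For example, assuming a registered suite called 'my.suite':

   % cylc kill my.suite 'model.20200202T0000Z'   # kill one task
   % cylc kill my.suite ':submitted'             # kill all submitted tasks
   % cylc kill my.suite                          # kill all active tasks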

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
15.6.3.40. list
   Usage: cylc [info|prep] list|ls [OPTIONS] SUITE

Print runtime namespace names (tasks and families), the first-parent
inheritance graph, or actual tasks for a given cycle range.

The first-parent inheritance graph determines the primary task family
groupings that are collapsible in gcylc suite views and the graph
viewer tool. To visualize the full multiple inheritance hierarchy use:
  'cylc graph -n'.
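
For example, assuming a registered suite called 'my.suite':

   % cylc list my.suite                  # task names used in the graph
   % cylc list -n -w my.suite            # all namespaces, with titles
   % cylc list -p 20200101T0000Z,20200105T0000Z my.suite   # actual task IDs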

Arguments:
   SUITE               Suite name or path

Options:
  -h, --help            show this help message and exit
  -a, --all-tasks       Print all tasks, not just those used in the graph.
  -n, --all-namespaces  Print all runtime namespaces, not just tasks.
  -m, --mro             Print the linear "method resolution order" for each
                        namespace (the multiple-inheritance precedence order
                        as determined by the C3 linearization algorithm).
  -t, --tree            Print the first-parent inheritance hierarchy in tree
                        form.
  -b, --box             With -t/--tree, use unicode box characters. Your
                        terminal must be able to display unicode characters.
  -w, --with-titles     Print namespace titles too.
  -p START[,STOP], --points=START[,STOP]
                        Print actual task IDs from the START [through STOP]
                        cycle points.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
  --initial-cycle-point=CYCLE_POINT, --initial-point=CYCLE_POINT, --icp=CYCLE_POINT, --ict=CYCLE_POINT
                        Set the initial cycle point. Required if not defined
                        in suite.rc.
15.6.3.41. ls-checkpoints
   Usage: cylc [info] ls-checkpoints [OPTIONS] REG [ID ...]

In the absence of arguments and the --all option, list checkpoint IDs, their
times and events. Otherwise, display the latest checkpoint and/or the specified
checkpoints of suite parameters, task pool and broadcast states in the suite
runtime database.
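
For example, assuming a registered suite called 'my.suite' and a hypothetical
checkpoint ID of 1:

   % cylc ls-checkpoints my.suite      # list checkpoint IDs, times and events
   % cylc ls-checkpoints my.suite 1    # display the state of checkpoint 1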


Arguments:
   REG                    Suite name
   [ID ...]               Checkpoint ID (default=latest)

Options:
  -h, --help     show this help message and exit
  -a, --all      Display data of all available checkpoints.
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.42. message
   Usage: cylc [task] message [OPTIONS] -- [REG] [TASK-JOB] [[SEVERITY:]MESSAGE ...]

Record task job messages.

Send task job messages to:
- The job stdout/stderr.
- The job status file, if there is one.
- The suite server program, if communication is possible.

Task jobs use this command to record and report status such as success and
failure. Applications run by task jobs can use this command to report messages
and to report registered task outputs.

Messages can be specified as arguments. A '-' indicates that the command should
read messages from STDIN. When reading from STDIN, multiple messages are
separated by empty lines. Examples:

Single message as an argument:
 % cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" 'Hello world!'

Multiple messages as arguments:
 % cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" \
        'Hello world!' 'Hi' 'WARNING:Hey!'

Multiple messages on STDIN:
 % cylc message -- "${CYLC_SUITE_NAME}" "${CYLC_TASK_JOB}" - <<'__STDIN__'
 Hello
 world!

 Hi

 WARNING:Hey!
 __STDIN__

Note "${CYLC_SUITE_NAME}" and "${CYLC_TASK_JOB}" are made available in task job
environments - you do not need to write their actual values in task scripting.

Each message can be prefixed with a severity level using the syntax 'SEVERITY:
MESSAGE'.

The default message severity is INFO. The --severity=SEVERITY option can be
used to set the default severity level for all unprefixed messages.

Note: to abort a job script with a custom error message, use cylc__job_abort:
  cylc__job_abort 'message...'
(For technical reasons this is a shell function, not a cylc sub-command.)

For backward compatibility, if the number of arguments is less than or equal to
the command assumes the classic interface, where all arguments are messages.
Otherwise, the first 2 arguments are assumed to be the suite name and the task
job identifier.


Arguments:
   [REG]                                  Suite name
   [TASK-JOB]                             Task job identifier CYCLE/TASK_NAME/SUBMIT_NUM
   [[SEVERITY:]MESSAGE ...]               Messages

Options:
  -h, --help            show this help message and exit
  -s SEVERITY, -p SEVERITY, --severity=SEVERITY, --priority=SEVERITY
                        Set severity levels for messages that do not have one
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.43. monitor
   Usage: cylc [info] monitor [OPTIONS] REG [USER_AT_HOST]

A terminal-based live suite monitor.  Exit with 'Ctrl-C'.

The USER_AT_HOST argument allows suite selection by 'cylc scan' output:
  cylc monitor $(cylc scan | grep <suite_name>)


Arguments:
   REG                          Suite name
   [USER_AT_HOST]               user@host:port, shorthand for --user, --host & --port.

Options:
  -h, --help            show this help message and exit
  -a, --align           Align task names. Only useful for small suites.
  -r, --restricted      Restrict display to active task states. This may be
                        useful for monitoring very large suites. The state
                        summary line still reflects all task proxies.
  -s ORDER, --sort=ORDER
                        Task sort order: "definition" or "alphanumeric". The
                        default is definition order, as determined by global
                        config. (Definition order is the order that tasks
                        appear under [runtime] in the suite definition).
  -o, --once            Show a single view then exit.
  -u, --runahead        Display task proxies in the runahead pool (off by
                        default).
  -i SECONDS, --interval=SECONDS
                        Interval between suite state retrievals, in seconds
                        (default 1).
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
15.6.3.44. nudge
   Usage: cylc [control] nudge [OPTIONS] REG

Cause the cylc task processing loop to be invoked in a running suite.

This happens automatically when the state of any task changes such that
task processing (dependency negotiation etc.) is required, or if a
clock-trigger task is ready to run.

The main reason to use this command is to update the "estimated time till
completion" intervals shown in the tree-view suite control GUI, during
periods when nothing else is happening.


Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.45. ping
   Usage: cylc [discovery] ping [OPTIONS] REG [TASK]

If suite REG is running or TASK in suite REG is currently running,
exit with success status, else exit with error status.
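
For example, in suite housekeeping scripts (assuming a registered suite called
'my.suite'):

   % cylc ping my.suite && echo "suite is up"
   % cylc ping my.suite model.20200202T0000Z   # is this task running now?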

Arguments:
   REG                  Suite name
   [TASK]               Task NAME.CYCLE_POINT

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.46. poll
   Usage: cylc [control] poll [OPTIONS] REG [TASK_GLOB ...]

Poll (query) task jobs to verify and update their statuses.
  cylc poll REG - poll all active tasks
  cylc poll REG TASK_GLOB ... - poll multiple active tasks or families

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
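
For example, assuming a registered suite called 'my.suite':

   % cylc poll my.suite                            # poll all active tasks
   % cylc poll -s my.suite 'model.20200202T0000Z'  # poll even if recorded as succeeded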

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  -s, --succeeded       Allow polling of succeeded tasks.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
15.6.3.47. print
   Usage: cylc [prep] print [OPTIONS] [REGEX]

Print registered (installed) suites.

Note on result filtering:
  (a) The filter patterns are Regular Expressions, not shell globs, so
the general wildcard is '.*' (match zero or more of anything), NOT '*'.
  (b) For printing purposes there is an implicit wildcard at the end of
each pattern ('foo' is the same as 'foo/*'); use the string end marker
to prevent this ('foo$' matches only literal 'foo').
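
For example, given hypothetical registered suites 'dogs' and 'dogs/fido':

   % cylc print 'dogs'    # matches both (implicit trailing wildcard)
   % cylc print 'dogs$'   # matches only the literal name 'dogs'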

Arguments:
   [REGEX]               Suite name regular expression pattern

Options:
  -h, --help     show this help message and exit
  -t, --tree     Print suites in nested tree form.
  -b, --box      Use unicode box drawing characters in tree views.
  -a, --align    Align columns.
  -x             Don't print suite definition directory paths.
  -y             Don't print suite titles.
  --fail         Fail (exit 1) if no matching suites are found.
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.48. profile-battery
   Usage: cylc profile-battery [-e [EXPERIMENT ...]] [-v [VERSION ...]]

Run profiling experiments against different versions of cylc. A list of
experiments can be specified after the -e flag; if none is provided, the
experiment "complex" will be chosen. A list of versions to profile against can
be specified after the -v flag; if none is provided, the current version will
be used.

Experiments are stored in etc/profile-experiments; user experiments can be
stored in .profiling/experiments. Experiments are specified without the file
extension, and experiments in .profiling/ are chosen before those in etc/.

IMPORTANT: See etc/profile-experiments/example for an experiment template with
further details.

Versions are any valid git identifiers, i.e. tags, branches or commits. To
compare results across different cylc versions, either:
    * Supply cylc profile-battery with a complete list of the versions you wish
      to profile; it will then offer to check out the required versions
      automatically.
    * Check out each version manually, running cylc profile-battery against
      only one version at a time. Once all results have been gathered you can
      then run cylc profile-battery with a complete list of versions.

Profiling will save results to .profiling/results.json where they can be used
for future comparisons. To list profiling results run:
    * cylc profile-battery --ls  # list all results
    * cylc profile-battery --ls -e experiment  # list all results for
                                               # experiment "experiment".
    * cylc profile-battery --ls --delete -v  6.1.2  # Delete all results for
                                                    # version 6.1.2 (prompted).

If matplotlib and numpy are installed, profiling generates plots, which are
saved to .profiling/plots or, with the -i flag, presented in an interactive
window.

Results are stored along with a checksum for the experiment file. When an
experiment file is changed, previous results are maintained and future results
are stored separately. To copy results from an older version of an experiment
into those from the current one, run:
    * cylc profile-battery --promote experiment@checksum
NOTE: At present results cannot be analysed without the experiment file, so old
results must be "copied" in this way to be re-used.

The results output contains only a small number of metrics; to see a full list
of results use the --full option.


Options:
  -h, --help            show this help message and exit
  -e, --experiments     Specify list of experiments to run.
  -v, --versions        Specify cylc versions to profile. Git tags, branches,
                        commits are all valid.
  -i, --interactive     Open any plots in an interactive window rather than
                        saving them to files.
  -p, --no-plots        Don't generate any plots.
  --ls, --list-results  List all stored results. Experiments and versions to
                        list can be specified using --experiments and
                        --versions.
  --delete              Delete stored results (to be used in combination with
                        --list-results).
  -y, --yes             Answer yes to any user input. Will check-out cylc
                        versions as required.
  --full-results, --full
                        Display all gathered metrics.
  --lobf-order=LOBF_ORDER
                        The order (int) of the line of best fit to be drawn: 0
                        for no line of best fit, 1 for linear, 2 for quadratic,
                        etc.
  --promote=PROMOTE     Promote results from an older version of an experiment
                        to the current version. To be used when making non-
                        functional changes to an experiment.
  --test                For development purposes, run experiment without
                        saving results and regardless of any prior runs.
15.6.3.49. ref-graph
   Usage: cylc ref-graph SUITE [START] [STOP]

Implement the old ``cylc graph --reference`` command for producing a
text-format representation of a suite graph.

The `--reference` flag is optional here.

This is a back-port of the Python 3 bin/cylc-graph for Cylc 8, to generate
text-format "reference graphs" without the need for PyGTK to be installed.
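
For example, assuming a registered suite called 'my.suite':

   % cylc ref-graph my.suite 20200101T0000Z 20200103T0000Z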

Arguments:
   [SUITE]               Suite name or path
   [START]               Initial cycle point (default: suite initial point)
   [STOP]                Final cycle point (default: initial + 3 points)

Options:
  -h, --help            show this help message and exit
  -u, --ungrouped       Start with task families ungrouped (the default is
                        grouped).
  -n, --namespaces      Plot the suite namespace inheritance hierarchy (task
                        run time properties).
  -r, --reference       Output in a sorted plain text format for comparison
                        purposes.
  --show-suicide        Show suicide triggers.  They are not shown by default,
                        unless toggled on with the tool bar button.
  --icp=CYCLE_POINT     Set initial cycle point. Required if not defined in
                        suite.rc.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
15.6.3.50. register
   Usage: cylc [prep] register [OPTIONS] [REG] [PATH]

Register the name REG for the suite definition in PATH. The suite server
program can then be started, stopped, and targeted by name REG. (Note that
"cylc run" can also register suites on the fly).

Registration creates a suite run directory "~/cylc-run/REG/" containing a
".service/source" symlink to the suite definition location. The .service
directory is also used for server authentication files at runtime. With the
"--run-dir" option "~/cylc-run/REG" can be symlinked to another location.

Suite names can be hierarchical, corresponding to paths under the run
directory.

  % cylc register dogs/fido PATH
Register PATH/suite.rc as dogs/fido, with run directory ~/cylc-run/dogs/fido.

  % cylc register dogs/fido
Register $PWD/suite.rc as dogs/fido.

  % cylc register
Register $PWD/suite.rc as the parent directory name: $(basename $PWD).

The same suite can be registered with multiple names; this results in multiple
suite run directories that link to the same suite definition.

To "unregister" a suite, delete or rename its run directory (renaming it under
~/cylc-run effectively re-registers the original suite with the new name).

Use of "--redirect" is required to allow an existing name (and run directory)
to be associated with a different suite definition. This is potentially
dangerous because the new suite will overwrite files in the existing run
directory. You should consider deleting or renaming an existing run directory
rather than just re-using it with another suite.

Arguments:
   [REG]                Suite name
   [PATH]               Suite definition directory (defaults to $PWD)

Options:
  -h, --help        show this help message and exit
  --redirect        Allow an existing suite name and run directory to be used
                    with another suite.
  --run-dir=RUNDIR  Symlink $HOME/cylc-run/REG to RUNDIR/REG.
  --user=USER       Other user account name. This results in command
                    reinvocation on the remote account.
  --host=HOST       Other host name. This results in command reinvocation on
                    the remote account.
  -v, --verbose     Verbose output mode.
  --debug           Output developer information and show exception
                    tracebacks.
15.6.3.51. release
   Usage: cylc [control] release|unhold [OPTIONS] REG [TASK_GLOB ...]

Release a held suite or tasks.
  cylc release REG - release the suite
  cylc release REG TASK_GLOB ... - release one or more tasks

Held tasks do not submit their jobs even if ready to run.

See also 'cylc [control] hold'.

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
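
For example, assuming a registered suite called 'my.suite':

   % cylc release my.suite                          # release the whole suite
   % cylc release my.suite 'model.20200202T0000Z'   # release one held task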

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
15.6.3.52. reload
   Usage: cylc [control] reload [OPTIONS] REG

Tell a suite to reload its definition at run time. All settings
including task definitions, with the exception of suite log
configuration, can be changed on reload. Note that defined tasks can be
added to or removed from a running suite with the 'cylc insert' and
'cylc remove' commands, without reloading. This command also allows
addition and removal of actual task definitions, and therefore insertion
of tasks that were not defined at all when the suite started (you will
still need to manually insert a particular instance of a newly defined
task). Live task proxies that are orphaned by a reload (i.e. their task
definitions have been removed) will be removed from the task pool if
they have not started running yet. Changes to task definitions take
effect immediately, unless a task is already running at reload time.

If the suite was started with Jinja2 template variables set on the
command line (cylc run --set FOO=bar REG) the same template settings
apply to the reload (only changes to the suite.rc file itself are
reloaded).

If the modified suite definition does not parse, failure to reload will
be reported but no harm will be done to the running suite.
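
A typical reload sequence, assuming a registered suite called 'my.suite' (the
suite definition path is illustrative):

   % vim /path/to/my.suite/suite.rc   # edit the suite definition
   % cylc validate my.suite           # check that the change parses
   % cylc reload my.suite             # apply it to the running suite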

Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
15.6.3.53. remote-init
   Usage: cylc [task] remote-init [--indirect-comm=ssh] UUID RUND

(This command is for internal use.)
Install suite service files on a task remote (i.e. a [owner@]host):
    .service/contact: All task -> suite communication methods.
    .service/passphrase: Direct task -> suite HTTP(S) communication only.
    .service/ssl.cert: Direct task -> suite HTTPS communication only.

The content of the items to install is read from a tar file on STDIN.

Return:
    0:
        On success or if initialisation not required:
        - Print SuiteSrvFilesManager.REMOTE_INIT_NOT_REQUIRED if initialisation
          not required (e.g. remote has shared file system with suite host).
        - Print SuiteSrvFilesManager.REMOTE_INIT_DONE on success.
    1:
        On failure.



Arguments:
   UUID               UUID of current suite server process
   RUND               The run directory of the suite

Options:
  -h, --help            show this help message and exit
  --indirect-comm=METHOD
                        specify use of indirect communication via e.g. ssh
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
15.6.3.54. remote-tidy
   Usage: cylc [task] remote-tidy RUND

(This command is for internal use.)
Remove ".service/contact" from a task remote (i.e. a [owner@]host).
Remove the ".service" directory on the remote if it is then empty.



Arguments:
   RUND               The run directory of the suite

Options:
  -h, --help     show this help message and exit
  --user=USER    Other user account name. This results in command reinvocation
                 on the remote account.
  --host=HOST    Other host name. This results in command reinvocation on the
                 remote account.
  -v, --verbose  Verbose output mode.
  --debug        Output developer information and show exception tracebacks.
15.6.3.55. remove
   Usage: cylc [control] remove [OPTIONS] REG TASK_GLOB [...]

Remove one or more task instances from a running suite.

Tasks will be forced to spawn successors before removal if they have not done
so already, unless you use '--no-spawn'.

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
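
For example, assuming a registered suite called 'my.suite':

   % cylc remove my.suite 'model.20200202T0000Z'             # spawn successor, then remove
   % cylc remove --no-spawn my.suite 'model.20200202T0000Z'  # remove without spawning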

Arguments:
   REG                           Suite name
   TASK_GLOB [...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  --no-spawn            Do not spawn successors before removal.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
15.6.3.56. report-timings
   Usage: cylc [util] report-timings [OPTIONS] REG

Retrieve suite timing information for wait and run time performance analysis.
Raw output and summary output (in text or HTML format) are available.  Output
is sent to standard output, unless an output filename is supplied.

Summary Output (the default):
Data, stratified by host and batch system, providing a statistical summary of:
    1. Queue wait time (duration between task submission and start times)
    2. Task run time (duration between start and succeed times)
    3. Total run time (duration between task submission and succeed times)
Summary tables can be output in plain text format, or HTML with embedded SVG
boxplots.  Both summary options require the Pandas library, and the HTML
summary option requires the Matplotlib library.

Raw Output:
A flat list of tabular data that provides (for each task and cycle) the
    1. Time of successful submission
    2. Time of task start
    3. Time of task successful completion
as well as information about the batch system and remote host to permit
stratification/grouping if desired by downstream processors.

Timings are shown only for succeeded tasks.

For long-running and/or large suites (i.e. for suites with many task events),
the database query to obtain the timing information may take some time.
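
For example, assuming a registered suite called 'my.suite' (the output file
name is illustrative):

   % cylc report-timings -s my.suite                  # plain text summary
   % cylc report-timings -w -O timings.html my.suite  # HTML summary with boxplots
   % cylc report-timings -r my.suite                  # raw data for custom analysis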



Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  -r, --raw             Show raw timing output suitable for custom
                        diagnostics.
  -s, --summary         Show textual summary timing output for tasks.
  -w, --web-summary     Show HTML summary timing output for tasks.
  -O OUTPUT_FILENAME, --output-file=OUTPUT_FILENAME
                        Output to a specific file
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
15.6.3.57. reset
   Usage: cylc [control] reset [OPTIONS] REG [TASK_GLOB ...]

Force task instances to a specified state.
  cylc reset --state=xxx REG - reset all tasks to state xxx
  cylc reset --state=xxx REG TASK_GLOB ... - reset one or more tasks to xxx

Outputs are automatically updated to reflect the new task state, except for
custom message outputs which can be manipulated directly with "--output".

Prerequisites reflect the state of other tasks; they are not changed except
to unset them on resetting state to 'waiting' or earlier.

To hold and release tasks use "cylc hold" and "cylc release", not this command.

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.
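
For example, assuming a registered suite called 'my.suite':

   % cylc reset --state=succeeded my.suite 'model.20200202T0000Z'
   % cylc reset --state=waiting my.suite '*/post*:failed'   # re-queue failed tasks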

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  -s STATE, --state=STATE
                        Reset task state to STATE; valid states are succeeded,
                        waiting, submitted, failed, running, submit-failed and
                        expired.
  -O OUTPUT, --output=OUTPUT
                        Find task output by message string or trigger string,
                        set complete or incomplete with !OUTPUT, '*' to set
                        all complete, '!*' to set all incomplete. Can be used
                        more than once to reset multiple task outputs.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
15.6.3.58. restart
   Usage: cylc [control] restart [OPTIONS] [REG]

Start a suite run from the previous state. To start from scratch (cold or warm
start) see the 'cylc run' command.

The scheduler runs as a daemon unless you specify --no-detach.

Tasks recorded as submitted or running are polled at start-up to determine what
happened to them while the suite was down.

Arguments:
   [REG]               Suite name

Options:
  -h, --help            show this help message and exit
  -n, --no-detach, --non-daemon
                        Do not daemonize the suite
  -a, --no-auto-shutdown
                        Do not shut down the suite automatically when all
                        tasks have finished. This flag overrides the
                        corresponding suite config item.
  --auto-shutdown       Shut down the suite automatically when all tasks have
                        finished. This flag overrides the corresponding suite
                        config item.
  --profile             Output profiling (performance) information
  --checkpoint=CHECKPOINT-ID
                        Specify the ID of a checkpoint to restart from
  --ignore-initial-cycle-point
                        Ignore the initial cycle point in the suite run
                        database. If one is specified in the suite definition
                        it will be used, however.
  --ignore-final-cycle-point
                        Ignore the final cycle point in the suite run
                        database. If one is specified in the suite definition
                        it will be used, however.
  --ignore-start-cycle-point
                        Ignore the start cycle point in the suite run
                        database.
  --ignore-stop-cycle-point
                        Ignore the stop cycle point in the suite run database.
  --final-cycle-point=CYCLE_POINT, --final-point=CYCLE_POINT, --until=CYCLE_POINT, --fcp=CYCLE_POINT
                        Set the final cycle point.
  --stop-cycle-point=CYCLE_POINT, --stop-point=CYCLE_POINT
                        Set stop point. Shut down after all tasks have PASSED
                        this cycle point. (Not to be confused with the final
                        cycle point.)
  --hold                Hold suite immediately on starting.
  --hold-point=CYCLE_POINT, --hold-after=CYCLE_POINT
                        Set hold cycle point. Hold suite AFTER all tasks have
                        PASSED this cycle point.
  -m STRING, --mode=STRING
                        Run mode: live, dummy, dummy-local, simulation
                        (default live).
  --reference-log       Generate a reference log for use in reference tests.
  --reference-test      Do a test run against a previously generated reference
                        log.
  --host=HOST           Specify the host on which to start-up the suite. If
                        not specified, a host will be selected using the
                        'suite servers' global config.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
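
For example (the suite name is an illustrative placeholder; checkpoint IDs
can be listed with 'cylc ls-checkpoints'):

  % cylc restart my.suite                    # restart from the latest state
  % cylc restart --checkpoint=1 my.suite     # restart from a stored checkpoint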
15.6.3.59. review
   Usage: cylc [info] review [OPTIONS] [start [PORT]] [stop]

Start/stop ad-hoc Cylc Review web service server for browsing users' suite
logs via an HTTP interface.

With no arguments, the status of the ad-hoc web service server is printed.

For 'cylc review start', if 'PORT' is not specified, port 8080 is used.

Arguments:
   [start [PORT]]               Start ad-hoc web service server.
   [stop]                       Stop ad-hoc web service server.

Options:
  -h, --help            show this help message and exit
  -y, --non-interactive, --yes
                        Switch off interactive prompting i.e. answer yes to
                        everything (for stop only).
  -R, --service-root    Include web service name under root of URL (for start
                        only).
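
For example, to serve suite logs on a non-default port and later stop the
service (the port number is illustrative):

  % cylc review start 8081   # any free port can be chosen
  % cylc review stop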
15.6.3.60. run
   Usage: cylc [control] run|start [OPTIONS] [[REG] [START_POINT] ]

Start a suite run from scratch, ignoring dependence prior to the start point.

WARNING: this will wipe out previous suite state. To restart from a previous
state, see 'cylc restart --help'.

The scheduler will run as a daemon unless you specify --no-detach.

If the suite is not already registered (by "cylc register" or a previous run)
it will be registered on the fly before start up.

% cylc run REG
  Run the suite registered with name REG.

% cylc run
  Register $PWD/suite.rc as $(basename $PWD) and run it.
 (Note REG must be given explicitly if START_POINT is on the command line.)

A "cold start" (the default) starts from the suite initial cycle point
(specified in the suite.rc or on the command line). Any dependence on tasks
prior to the suite initial cycle point is ignored.

A "warm start" (-w/--warm) starts from a given cycle point later than the suite
initial cycle point (specified in the suite.rc). Any dependence on tasks prior
to the given warm start cycle point is ignored. The suite initial cycle point
is preserved.

Arguments:
   [REG]                       Suite name
   [START_POINT]               Initial cycle point or 'now';
                               overrides the suite definition.

Options:
  -h, --help            show this help message and exit
  -n, --no-detach, --non-daemon
                        Do not daemonize the suite
  -a, --no-auto-shutdown
                        Do not shut down the suite automatically when all
                        tasks have finished. This flag overrides the
                        corresponding suite config item.
  --auto-shutdown       Shut down the suite automatically when all tasks have
                        finished. This flag overrides the corresponding suite
                        config item.
  --profile             Output profiling (performance) information
  -w, --warm            Warm start the suite. The default is to cold start.
  --start-cycle-point=CYCLE_POINT, --start-point=CYCLE_POINT
                        Set the start cycle point. Implies --warm. (Not to
                        be confused with the initial cycle point.)
  --final-cycle-point=CYCLE_POINT, --final-point=CYCLE_POINT, --until=CYCLE_POINT, --fcp=CYCLE_POINT
                        Set the final cycle point.
  --stop-cycle-point=CYCLE_POINT, --stop-point=CYCLE_POINT
                        Set stop point. Shut down after all tasks have PASSED
                        this cycle point. (Not to be confused with the final
                        cycle point.)
  --hold                Hold suite immediately on starting.
  --hold-point=CYCLE_POINT, --hold-after=CYCLE_POINT
                        Set hold cycle point. Hold suite AFTER all tasks have
                        PASSED this cycle point.
  -m STRING, --mode=STRING
                        Run mode: live, dummy, dummy-local, simulation
                        (default live).
  --reference-log       Generate a reference log for use in reference tests.
  --reference-test      Do a test run against a previously generated reference
                        log.
  --host=HOST           Specify the host on which to start-up the suite. If
                        not specified, a host will be selected using the
                        'suite servers' global config.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
  --initial-cycle-point=CYCLE_POINT, --initial-point=CYCLE_POINT, --icp=CYCLE_POINT, --ict=CYCLE_POINT
                        Set the initial cycle point. Required if not defined
                        in suite.rc.
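
For example (a sketch; the suite name and cycle points are placeholders):

  % cylc run my.suite 20200101T00Z          # cold start at the given point
  % cylc run --warm my.suite 20200108T00Z   # warm start at a later point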
15.6.3.61. scan
   Usage: cylc [discovery] scan [OPTIONS] [HOSTS ...]

Print information about running suites.

By default, it will obtain a listing of running suites for the current user
from the file system, before connecting to the suites to obtain information.
Use the -o/--suite-owner option to get information of running suites for other
users.

If a list of HOSTS is specified, it will obtain a listing of running suites by
scanning all ports in the relevant range for running suites on the specified
hosts. If the -a/--all option is specified, it will use the global
configuration "[suite servers]scan hosts" setting to determine a list of hosts
to scan.

Suite passphrases are not needed to get identity information (name and owner).
Titles, descriptions, state totals, and cycle point state totals may also be
revealed publicly, depending on global and suite authentication settings. Suite
passphrases still grant full access regardless of what is revealed publicly.

WARNING: a suite suspended with Ctrl-Z will cause port scans to hang until the
connection times out (see --comms-timeout).

Arguments:
   [HOSTS ...]               Hosts to scan instead of the configured hosts.

Options:
  -h, --help            show this help message and exit
  -a, --all             Scan all port ranges in known hosts.
  -n PATTERN, --name=PATTERN
                        List suites with name matching PATTERN (regular
                        expression). Defaults to any name. Can be used
                        multiple times.
  -o PATTERN, --suite-owner=PATTERN
                        List suites with owner matching PATTERN (regular
                        expression). Defaults to current user. Use '.*' to
                        match all known users. Can be used multiple times.
  -d, --describe        Print suite metadata if available.
  -s, --state-totals    Print number of tasks in each state if available
                        (total, and by cycle point).
  -f, --full            Print all available information about each suite.
  -c, --color, --colour
                        Print task state summaries using terminal color
                        control codes.
  -b, --no-bold         Don't use any bold text in the command output.
  --print-ports         Print the port range from the global config file.
  --comms-timeout=SEC   Set a timeout for network connections to each running
                        suite. The default is 5 seconds.
  --old, --old-format   Legacy output format ("suite owner host port").
  -r, --raw, --raw-format
                        Parsable format ("suite|owner|host|property|value").
  -j, --json, --json-format
                        JSON format.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
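
For example (host names are illustrative placeholders):

  % cylc scan                # list your own running suites
  % cylc scan -s -c          # include colored task state totals
  % cylc scan host1 host2    # scan the given hosts instead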
15.6.3.62. scp-transfer
   Usage: cylc [util] scp-transfer [OPTIONS]

An scp wrapper for transferring a list of files and/or directories
at once. The source and target scp URLs can be local or remote (scp
can transfer files between two remote hosts). Passwordless ssh must
be configured appropriately.

ENVIRONMENT VARIABLE INPUTS:
$SRCE  - list of sources (files or directories) as scp URLs.
$DEST  - parallel list of targets as scp URLs.
The source and destination lists should be space-separated.

We let scp determine the validity of source and target URLs.
Target directories are created pre-copy if they don't exist.

Options:
 -v     - verbose: print scp stdout.
 --help - print this usage message.
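
For example (paths and host are illustrative placeholders):

 % export SRCE="/home/me/data.nc /home/me/plots"     # placeholder sources
 % export DEST="hpc1:/tmp/data.nc hpc1:/tmp/plots"   # placeholder targets
 % cylc scp-transfer -v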
15.6.3.64. set-verbosity
   Usage: cylc [control] set-verbosity [OPTIONS] REG LEVEL

Change the logging severity level of a running suite.  Only messages at
or above the chosen severity level will be logged; for example, if you
choose WARNING, only warnings and critical messages will be logged.

Arguments:
   REG                 Suite name
   LEVEL               INFO, WARNING, NORMAL, CRITICAL, ERROR, DEBUG

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
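
For example, to turn on debug-level logging in a running suite (the suite
name is an illustrative placeholder):

  % cylc set-verbosity my.suite DEBUG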
15.6.3.65. show
   Usage: cylc [info] show [OPTIONS] REG [TASK_NAME or TASK_GLOB ...]

Query a running suite for:
  cylc show REG - suite metadata
  cylc show REG TASK_NAME - task metadata
  cylc show REG TASK_GLOB - prerequisites and outputs of matched task instances

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.

Arguments:
   REG                                        Suite name
   [TASK_NAME or TASK_GLOB ...]               Task names or match patterns

Options:
  -h, --help            show this help message and exit
  --list-prereqs        Print a task's pre-requisites as a list.
  --json                Print output in JSON format.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
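
For example (the suite name, task name, and cycle point are illustrative
placeholders):

  % cylc show my.suite                      # suite metadata
  % cylc show my.suite foo                  # task metadata
  % cylc show my.suite foo.20200202T0000Z   # prerequisites and outputs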
15.6.3.66. spawn
   Usage: cylc [control] spawn [OPTIONS] REG [TASK_GLOB ...]

Force task proxies to spawn successors at their own next cycle point.
  cylc spawn REG - force spawn all tasks in a suite
  cylc spawn REG TASK_GLOB ... - force spawn one or more tasks in a suite

Tasks normally spawn on reaching the "submitted" status. Spawning them early
allows running successive instances of the same task out of order. See also
the "spawn to max active cycle points" suite configuration.

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
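
For example, to force all task proxies in one cycle point to spawn their
successors (the suite name and cycle point are illustrative placeholders):

  % cylc spawn my.suite '*.20200202T0000Z'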
15.6.3.67. stop
   Usage: cylc [control] stop|shutdown [OPTIONS] REG [STOP]

Tell a suite server program to shut down. In order to prevent failures going
unnoticed, suites only shut down automatically at a final cycle point if no
failed tasks are present. There are several shutdown methods:

  1. (default) stop after current active tasks finish
  2. (--now) stop immediately, orphaning current active tasks
  3. (--kill) stop after killing current active tasks
  4. (with STOP as a cycle point) stop after cycle point STOP
  5. (with STOP as a task ID) stop after task ID STOP has succeeded
  6. (--wall-clock=T) stop after time T (an ISO 8601 date-time format e.g.
     CCYYMMDDThh:mm, CCYY-MM-DDThh, etc).

Tasks that become ready after the shutdown is ordered will be submitted
immediately if the suite is restarted.  Remaining task event handlers and job
poll and kill commands, however, will be executed prior to shutdown, unless
--now is used.

This command exits immediately unless --max-polls is greater than zero, in
which case it polls to wait for suite shutdown.

Arguments:
   REG                  Suite name
   [STOP]               a/ task POINT (cycle point), or
                        b/ ISO 8601 date-time (clock time), or
                        c/ TASK (task ID).

Options:
  -h, --help            show this help message and exit
  -k, --kill            Shut down after killing currently active tasks.
  -n, --now             Shut down without waiting for active tasks to
                        complete. If this option is specified once, wait for
                        task event handler, job poll/kill to complete. If this
                        option is specified more than once, tell the suite to
                        terminate immediately.
  -w STOP, --wall-clock=STOP
                        Shut down after time STOP (ISO 8601 formatted)
  --max-polls=INT       Maximum number of polls (default 0).
  --interval=SECS       Polling interval in seconds (default 60).
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
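
For example (the suite name and cycle point are illustrative placeholders):

  % cylc stop my.suite                  # stop after active tasks finish
  % cylc stop --kill my.suite           # kill active tasks, then stop
  % cylc stop my.suite 20200202T0000Z   # stop after this cycle point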
15.6.3.68. submit
   Usage: cylc [task] submit|single [OPTIONS] REG TASK [...]

Submit a single task to run just as it would be submitted by its suite.  Task
messaging commands will print to stdout but will not attempt to communicate
with the suite (which does not need to be running).

For tasks present in the suite graph the given cycle point is adjusted up to
the next valid cycle point for the task. For tasks defined under runtime but
not present in the graph, the given cycle point is assumed to be valid.

WARNING: do not 'cylc submit' a task that is running in its suite at the
same time - both instances will attempt to write to the same job logs.

Arguments:
   REG                      Suite name
   TASK [...]               Family or task ID (NAME.CYCLE_POINT)

Options:
  -h, --help            show this help message and exit
  -d, --dry-run         Generate the job script for the task, but don't submit
                        it.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
  --initial-cycle-point=CYCLE_POINT, --initial-point=CYCLE_POINT, --icp=CYCLE_POINT, --ict=CYCLE_POINT
                        Set the initial cycle point. Required if not defined
                        in suite.rc.
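
For example, to generate a task's job script without submitting it, then
submit it for real (the suite name and task ID are illustrative placeholders):

  % cylc submit --dry-run my.suite foo.20200101T0000Z
  % cylc submit my.suite foo.20200101T0000Z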
15.6.3.69. suite-state
   Usage: cylc suite-state REG [OPTIONS]

Print task states retrieved from a suite database; or (with --task,
--point, and --status) poll until a given task reaches a given state; or (with
--task, --point, and --message) poll until a task receives a given message.
Polling is configurable with --interval and --max-polls; for a one-off
check use --max-polls=1. The suite database does not need to exist at
the time polling commences but allocated polls are consumed waiting for
it (consider max-polls*interval as an overall timeout).

Note: for non-cycling tasks, --point=1 must be provided.

For your own suites the database location is determined by your
site/user config. For other suites, e.g. those owned by others, or
mirrored suite databases, use --run-dir=DIR to specify the location.

Example usages:
  cylc suite-state REG --task=TASK --point=POINT --status=STATUS
returns 0 if TASK.POINT reaches STATUS before the maximum number of
polls, otherwise returns 1.

  cylc suite-state REG --task=TASK --point=POINT --status=STATUS --offset=PT6H
adds 6 hours to the value of CYCLE for carrying out the polling operation.

  cylc suite-state REG --task=TASK --status=STATUS --task-point
uses CYLC_TASK_CYCLE_POINT environment variable as the value for the CYCLE
to poll. This is useful when you want to use cylc suite-state in a cylc task.


Arguments:
   REG               Suite name

Options:
  -h, --help            show this help message and exit
  -t TASK, --task=TASK  Specify a task to check the state of.
  -p CYCLE, --point=CYCLE
                        Specify the cycle point to check task states for.
  -T, --task-point      Use the CYLC_TASK_CYCLE_POINT environment variable as
                        the cycle point to check task states for. Shorthand
                        for --point=$CYLC_TASK_CYCLE_POINT
  --template=TEMPLATE   Remote cycle point template (IGNORED - this is now
                        determined automatically).
  -d DIR, --run-dir=DIR
                        The top level cylc run directory if non-standard. The
                        database should be DIR/REG/log/db. Use to interrogate
                        suites owned by others, etc.; see note above.
  -s OFFSET, --offset=OFFSET
                        Specify an offset to add to the targeted cycle point
  -S STATUS, --status=STATUS
                        Specify a particular status or triggering condition to
                        check for. Valid triggering conditions to check for
                        include: 'fail', 'finish', 'start', 'submit' and
                        'succeed'. Valid states to check for include:
                        'runahead', 'waiting', 'held', 'queued', 'expired',
                        'ready', 'submit-failed', 'submit-retrying',
                        'submitted', 'retrying', 'running', 'failed' and
                        'succeeded'.
  -O MSG, -m MSG, --output=MSG, --message=MSG
                        Check custom task output by message string or trigger
                        string.
  --max-polls=INT       Maximum number of polls (default 10).
  --interval=SECS       Polling interval in seconds (default 60).
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
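
A further illustrative example: poll (at most 20 times, every 30 seconds) for
a custom task message (suite, task, point, and message are placeholders):

  cylc suite-state my.suite --task=obs --point=20200101T00 \
    --message='data ready' --max-polls=20 --interval=30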
15.6.3.70. test-battery
   cd "/Users/oliver/cylc-flow"
   Usage: cylc test-battery [...]

Run automated Cylc and Parsec tests, under (by default):
   <cylc-dir>/tests/.

Options and arguments are appended to "prove -j $NPROC -s -r ${@:-tests}".
NPROC is the number of concurrent processes to run, which defaults to the
global config "process pool size" setting.

The tests ignore normal site/user global config and instead use the file:
   <cylc-dir>/etc/global-tests.rc
This should specify test job hosts under the [test battery] section, plus any
other critical settings, including [hosts] configuration for test job hosts
(and special batchview commands like qcat if available). Additional global
config items can be added on the fly using the create_test_globalrc shell
function defined in the test_header.

Suite run directories are only cleaned up for passing tests on the suite host.

Set "export CYLC_TEST_DEBUG=true" to print failed-test stderr to the terminal.

To change the test file comparison command from "diff -u" do (for example):
   export CYLC_TEST_DIFF_CMD='xxdiff -D'

Some test suites submit jobs via 'at', so the atd daemon must be running on
the job hosts.

Commits or Pull Requests to cylc/cylc-flow on GitHub will trigger Travis CI to
run generic (non platform-specific) tests - see <cylc-dir>/.travis.yml.

By default all tests are executed.  To run just a subset of them:
  * list individual tests or test directories to run on the command line
  * list individual tests or test directories to skip in $CYLC_TEST_SKIP
  * skip all generic tests with CYLC_TEST_RUN_GENERIC=false
  * skip all platform-specific tests with CYLC_TEST_RUN_PLATFORM=false
List specific tests relative to <cylc-dir> (i.e. starting with "tests/").
Some platform-specific tests are automatically skipped, depending on platform.

Platform-specific tests must set "CYLC_TEST_IS_GENERIC=false" before sourcing
the test_header.

Tests requiring the sqlite3 CLI must be skipped if sqlite3 is not installed (it
is not otherwise a Cylc software prerequisite):
| if ! which sqlite3 > /dev/null; then
|     # Skip the remaining 3 tests.
|     skip 3 "sqlite3 not installed?"
|     purge_suite $SUITE_NAME
|     exit 0
| fi

Options:
  -h, --help       Print this help message and exit.
  --chunk CHUNK    Divide the test battery into chunks and run the specified
                   chunk. CHUNK takes the format 'a/b' where 'b' is the number
                   of chunks to divide the battery into and 'a' is the number
                   of the chunk to run (1 <= a <= b).

Examples:

Run the full test suite with the default options.
  cylc test-battery
Run the full test suite with 12 processes
  cylc test-battery -j 12
Run only tests under "tests/cyclers/"
  cylc test-battery tests/cyclers
Run only "tests/cyclers/16-weekly.t" in verbose mode
  cylc test-battery -v tests/cyclers/16-weekly.t
Run only tests under "tests/cyclers/", and skip 00-daily.t
  export CYLC_TEST_SKIP=tests/cyclers/00-daily.t
  cylc test-battery tests/cyclers
Run the first quarter of the test battery
  cylc test-battery --chunk '1/4'
Re-run failed tests
  cylc test-battery --state=save
  cylc test-battery --state=failed
15.6.3.71. trigger
   Usage: cylc [control] trigger [OPTIONS] REG [TASK_GLOB ...]

Manually trigger tasks.
  cylc trigger REG - trigger all tasks in a running suite
  cylc trigger REG TASK_GLOB ... - trigger one or more tasks in a running suite

NOTE waiting tasks that are queue-limited will be queued if triggered, to
submit as normal when released by the queue; queued tasks will submit
immediately if triggered, even if that violates the queue limit (so you may
need to trigger a queue-limited task twice to get it to submit immediately).

For single tasks you can use "--edit" to edit the generated job script before
it submits, to apply one-off changes. A diff between the original and edited
job script will be saved to the task job log directory.

Multiple TASK_GLOBs can be given. They each match task proxy instances in the
current task pool by task or family name pattern, cycle point pattern, and task
state. They do NOT match any task at any point in the abstract suite graph; if
target task instances do not exist in the current pool you must insert them
first with the "cylc insert" command.
* [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
* [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
* TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
* FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]

For example, to match:
* all tasks in a cycle: '20200202T0000Z/*' or '*.20200202T0000Z'
* all tasks in the submitted status: ':submitted'
* retrying 'foo*' tasks in 0000Z cycles: 'foo*.*0000Z:retrying' or
  '*0000Z/foo*:retrying'
* retrying tasks in 'BAR' family: '*/BAR:retrying' or 'BAR.*:retrying'
* retrying tasks in 'BAR' or 'BAZ' families: '*/BA[RZ]:retrying' or
  'BA[RZ].*:retrying'

The old 'MATCH POINT' syntax will be automatically detected and supported. To
avoid this, use the '--no-multitask-compat' option, or use the new syntax
(with a '/' or a '.') when specifying 2 TASK_GLOB arguments.

Arguments:
   REG                           Suite name
   [TASK_GLOB ...]               Task match pattern(s)

Options:
  -h, --help            show this help message and exit
  -e, --edit            Manually edit the job script before running it.
  -g, --geditor         (with --edit) force use of the configured GUI editor.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --port=INT            Suite port number on the suite host. NOTE: this is
                        retrieved automatically if non-interactive ssh is
                        configured to the suite host.
  --use-ssh             Use ssh to re-invoke the command on the suite host.
  --ssh-cylc=SSH_CYLC   Location of cylc executable on remote ssh commands.
  --no-login            Do not use a login shell to run remote ssh commands.
                        The default is to use a login shell.
  --comms-timeout=SEC, --pyro-timeout=SEC
                        Set a timeout for network connections to the running
                        suite. The default is no timeout. For task messaging
                        connections see site/user config file documentation.
  --print-uuid          Print the client UUID to stderr. This can be matched
                        to information logged by the receiving suite server
                        program.
  --set-uuid=UUID       Set the client UUID manually (e.g. from prior use of
                        --print-uuid). This can be used to log multiple
                        commands under the same UUID (but note that only the
                        first [info] command from the same client ID will be
                        logged unless the suite is running in debug mode).
  -f, --force           Do not ask for confirmation before acting. Note that
                        it is not necessary to use this option if interactive
                        command prompts have been disabled in the site/user
                        config files.
  -m, --family          (Obsolete) This option is now ignored and is retained
                        for backward compatibility only. TASK_GLOB in the
                        argument list can be used to match task and family
                        names regardless of this option.
  --no-multitask-compat
                        Disallow backward compatible multitask interface.
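
For example, to trigger a single task with a one-off edit of its generated
job script (the suite name and task ID are illustrative placeholders):

  % cylc trigger --edit my.suite foo.20200202T0000Z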
15.6.3.72. upgrade-run-dir
   Usage: cylc [admin] upgrade-run-dir SUITE

For one-off conversion of a suite run directory to cylc-6 format.

Arguments:
     SUITE    suite name or run directory path

Options:
  -h, --help  show this help message and exit
15.6.3.73. validate
   Usage: cylc [prep] validate [OPTIONS] SUITE

Validate a suite definition.

If the suite definition uses include-files, reported line numbers
will correspond to the inlined version seen by the parser; use
'cylc view -i,--inline SUITE' for comparison.

Arguments:
   SUITE               Suite name or path

Options:
  -h, --help            show this help message and exit
  --strict              Fail any use of unsafe or experimental features.
                        Currently this just means naked dummy tasks (tasks
                        with no corresponding runtime section) as these may
                        result from unintentional typographic errors in task
                        names.
  -o FILENAME, --output=FILENAME
                        Specify a file name to dump the processed suite.rc.
  --profile             Output profiling (performance) information
  -u RUN_MODE, --run-mode=RUN_MODE
                        Validate for run mode.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
  --initial-cycle-point=CYCLE_POINT, --initial-point=CYCLE_POINT, --icp=CYCLE_POINT, --ict=CYCLE_POINT
                        Set the initial cycle point. Required if not defined
                        in suite.rc.
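
For example (the suite name and template variable are illustrative
placeholders):

  % cylc validate --strict my.suite
  % cylc validate -s RUN_ENV=test my.suite   # with a Jinja2 template variable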
15.6.3.74. view
   Usage: cylc [prep] view [OPTIONS] SUITE

View a read-only temporary copy of suite SUITE's suite.rc file in your
editor, after optional include-file inlining and Jinja2 preprocessing.

The edit process is spawned in the foreground as follows:
  % <editor> suite.rc
Where <editor> can be set in cylc global config.

For remote host or owner, the suite will be printed to stdout unless
the '-g,--gui' flag is used to spawn a remote GUI edit session.

See also 'cylc [prep] edit'.

Arguments:
   SUITE               Suite name or path

Options:
  -h, --help            show this help message and exit
  -i, --inline          Inline include-files.
  -e, --empy            View after EmPy template processing (implies
                        '-i/--inline' as well).
  -j, --jinja2          View after Jinja2 template processing (implies
                        '-i/--inline' as well).
  -p, --process         View after all processing (EmPy, Jinja2, inlining,
                        line-continuation joining).
  -m, --mark            (With '-i') Mark inclusions in the left margin.
  -l, --label           (With '-i') Label file inclusions with the file name.
                        Line numbers will not correspond to those reported by
                        the parser.
  --single              (With '-i') Inline only the first instances of any
                        multiply-included files. Line numbers will not
                        correspond to those reported by the parser.
  -c, --cat             Concatenate continuation lines (line numbers will not
                        correspond to those reported by the parser).
  -g, --gui             Force use of the configured GUI editor.
  --stdout              Print the suite definition to stdout.
  --mark-for-edit       (With '-i') View file inclusion markers as for 'cylc
                        edit --inline'.
  --user=USER           Other user account name. This results in command
                        reinvocation on the remote account.
  --host=HOST           Other host name. This results in command reinvocation
                        on the remote account.
  -v, --verbose         Verbose output mode.
  --debug               Output developer information and show exception
                        tracebacks.
  --suite-owner=OWNER   Specify suite owner
  -s NAME=VALUE, --set=NAME=VALUE
                        Set the value of a Jinja2 template variable in the
                        suite definition. This option can be used multiple
                        times on the command line. NOTE: these settings
                        persist across suite restarts, but can be set again on
                        the "cylc restart" command line if they need to be
                        overridden.
  --set-file=FILE       Set the value of Jinja2 template variables in the
                        suite definition from a file containing NAME=VALUE
                        pairs (one per line). NOTE: these settings persist
                        across suite restarts, but can be set again on the
                        "cylc restart" command line if they need to be
                        overridden.
15.6.3.75. warranty
   Usage: cylc [license] warranty [--help]

Cylc is released under the GNU General Public License v3.0. This command
prints the GPL v3.0 disclaimer of warranty.

Options:
  --help   Print this usage message.

15.7. The gcylc Graph View

The graph view in the gcylc GUI shows the structure of the suite as it evolves. It can work well even for large suites, but be aware that the Graphviz layout engine has to do a new global layout every time a task proxy appears in or disappears from the task pool. The following may help mitigate any jumping layout problems:

  • The disconnect button can be used to temporarily prevent the graph from changing as the suite evolves.
  • The greyed-out base nodes, which are only present to fill out the graph structure, can be toggled off (but this will split the graph into disconnected sub-trees).
  • Right-click on a task and choose the “Focus” option to restrict the graph display to that task’s cycle point. Anything interesting happening in other cycle points will show up as disconnected rectangular nodes to the right of the graph (and you can click on those to instantly refocus to their cycle points).
  • Task filtering is the ultimate quick route to focusing on just the tasks you’re interested in, but this will destroy the graph structure.

15.8. Cylc README File

# The Cylc Workflow Engine

[![Build Status](https://travis-ci.org/cylc/cylc.svg?branch=master)](https://travis-ci.org/cylc/cylc)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/1d6a97bf05114066ae30b63dcb0cdcf9)](https://www.codacy.com/app/Cylc/cylc?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=cylc/cylc&amp;utm_campaign=Badge_Grade)
[![codecov](https://codecov.io/gh/cylc/cylc/branch/master/graph/badge.svg)](https://codecov.io/gh/cylc/cylc)
[![DOI](https://zenodo.org/badge/1836229.svg)](https://zenodo.org/badge/latestdoi/1836229)
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00737/status.svg)](https://doi.org/10.21105/joss.00737)

Cylc (“silk”) orchestrates complex distributed suites of interdependent cycling
(or non-cycling) tasks. It was originally designed to automate environmental
forecasting systems at [NIWA](https://www.niwa.co.nz). Cylc is a general
workflow engine, however; it is not specialized to forecasting in any way.

[Quick Installation](INSTALL.md) |
[Web Site](https://cylc.github.io/) |
[Documentation](https://cylc.github.io/documentation) |
[Contributing](CONTRIBUTING.md)

### Copyright and Terms of Use

Copyright (C) NIWA & British Crown (Met Office) & Contributors.
 
Cylc is free software: you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software Foundation,
either version 3 of the License, or (at your option) any later version.
 
Cylc is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.  See the GNU General Public License for more details.
 
You should have received a copy of the GNU General Public License along with
Cylc.  If not, see [GNU licenses](http://www.gnu.org/licenses/).

## Cylc Documentation
 * See [The Cylc Web Site](https://cylc.github.io/)

## Acknowledgement for non-Cylc Work

See [Acknowledgement for Non-Cylc Work](ACKNOWLEDGEMENT.md).

15.9. Cylc INSTALL File

# Cylc: Quick Installation Guide

**See [The Cylc User Guide](https://cylc.github.io/documentation.html) for
more detailed information.**

Cylc must be installed on suite and task job hosts. Third-party dependencies
(below) are not required on job hosts.

### Python 2 or Python 3?

Currently in the source code repository:
- **master branch:** Python 3, ZeroMQ network layer, **no GUI** - Cylc-8 Work In Progress
- **7.8.x branch:** Python 2, Cherrypy network layer, PyGTK GUI - Cylc-7 Maintenance

The first official Cylc-8 release (with a new web UI) is not expected until late 2019.
Until then we recommend the latest cylc-7.8 release for production use.

**THIS IS THE 7.8.x (PYTHON 2) INSTALL.md**

### Third-party Software Packages

Install the packages listed in the **Installation** section of the User Guide.
See also *Check Software Installation* below.

### Installing Cylc

Download the latest tarball from [Cylc
Releases](https://github.com/cylc/cylc-flow/releases).

Successive Cylc releases should be installed side-by-side under a location
such as `/opt`:

```bash
cd /opt
tar xzf cylc-7.8.1.tar.gz
# DO NOT CHANGE THE NAME OF THE UNPACKED CYLC SOURCE DIRECTORY.
cd cylc-7.8.1
export PATH=$PWD/bin:$PATH
make
```

Then make (or update) a symlink to the latest installed version:
```bash
ln -s /opt/cylc-7.8.1 /opt/cylc
```

When you type `make`:
  * The Cylc User Guide is generated from source (if you have sphinx-doc installed).

If this is the first installed version of Cylc, copy the wrapper script
`usr/bin/cylc` to a location in the system executable path, such as
`/usr/bin/` or `/usr/local/bin/`, and edit it - as per the in-file
instructions - to point to the Cylc install location:

```bash
cp /opt/cylc-7.8.1/usr/bin/cylc /usr/local/bin/
# (and EDIT /usr/local/bin/cylc as instructed)
```

The wrapper is designed to invoke the latest (symlinked) version of Cylc by
default, or else a particular version determined by `$CYLC_VERSION` or
`$CYLC_HOME` in your environment. This is how a long-running suite server
program ensures that the jobs it manages invoke clients at the right cylc
version.
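
For example, a quick check of which version the wrapper resolves (a sketch;
assumes the side-by-side `/opt` layout above and a wrapper edited as
instructed):

```bash
# Use the default (symlinked) version:
cylc version

# Pin this shell to a specific installed version via the wrapper:
export CYLC_VERSION=7.8.1   # illustrative version number
cylc version
```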

### Check Software Installation

```
$ cylc check-software
Checking your software...
...
```

15.10. Cylc Development History - Major Changes

  • pre-cylc-3
    • early versions focused on the new scheduling algorithm. A suite was a collection of “task definition files” that encoded the prerequisites and outputs of each task, exposing cylc’s self-organising nature. Tasks could be transferred from one suite to another by simply copying their taskdef files over and checking prerequisite and output consistency. Global suite structure was not easy to discern until run time (although cylc-2 could generate resolved run time dependency graphs).
  • cylc-3
    • a new suite design interface: dependency graph and task runtime properties defined in a single structured, validated, configuration file - the suite.rc file
    • graphical user interface
    • suite graphing.
  • cylc-4
    • refined and organized the suite.rc file structure
    • task runtime properties defined by an efficient inheritance hierarchy
    • support for the Jinja2 template processor in suite configurations.
  • cylc-5
    • multi-threading for continuous network request handling and job submission
    • more task states to distinguish job submission from execution
    • dependence between suites via new suite run databases
    • polling and killing of real task jobs
    • polling as task communications option.
  • cylc-6
    • specification of all date-times and cycling workflows via ISO 8601 date-times, durations, and recurrence expressions
    • integer cycling
    • a multi-process pool to execute job submissions, event handlers, and poll and kill commands.
  • cylc-7
    • replaced the Pyro communications layer with RESTful HTTPS
    • removed deprecated pre-cylc-6 syntax and features.

15.11. Cylc 6 Migration Reference

Cylc 6 introduced new date-time-related syntax for the suite.rc file. In some places, this is quite radically different from the earlier syntax.

15.11.1. Timeouts and Delays

Timeouts and delays such as [cylc][[events]]timeout or [runtime][[my_task]][[[job]]]execution retry delays were written in a purely numeric form before cylc 6, in seconds, minutes (most common), or hours, depending on the setting.

They are now written in an ISO 8601 duration form, which has the benefit that the units are user-selectable (use 1 day instead of 1440 minutes) and explicit.

Nearly all timeouts and delays in cylc were in minutes, except for:

[runtime][[my_task]][[[suite state polling]]]interval
[runtime][[my_task]][[[simulation mode]]]run time range

which were in seconds, and

[scheduling]runahead limit

which was in hours (this is a special case discussed below in Runahead Limit).

See Table 3.

Table 3 Timeout/Delay Syntax Change Examples

    Setting                                                   Pre-Cylc-6         Cylc-6+
    [cylc][[events]]timeout                                   180                PT3H
    [runtime][[my_task]][[[job]]]execution retry delays       2*30, 360, 1440    2*PT30M, PT6H, P1D
    [runtime][[my_task]][[[suite state polling]]]interval     2                  PT2S

15.11.2. Runahead Limit

See [scheduling] -> runahead limit.

The [scheduling]runahead limit setting was written as a number of hours in pre-cylc-6 suites. This is now in ISO 8601 format for date-time cycling suites, so [scheduling]runahead limit=36 would be written [scheduling]runahead limit=PT36H.

There is a new preferred alternative to runahead limit, [scheduling]max active cycle points. This allows the user to configure how many cycle points can run at once (default 3). See [scheduling] -> max active cycle points.
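
For example, a sketch of the two alternatives (the initial cycle point is
invented):

[scheduling]
    initial cycle point = 20140101T00
    # ISO 8601 form of the old "runahead limit = 36":
    #runahead limit = PT36H
    # Preferred: limit the number of simultaneously active cycle points:
    max active cycle points = 3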

15.11.3. Cycle Time/Cycle Point

See [scheduling] -> initial cycle point.

The following suite.rc settings have changed name (Table 4):

Table 4 Cycle Point Renaming

    Pre-Cylc-6                           Cylc-6+
    [scheduling]initial cycle time       [scheduling]initial cycle point
    [scheduling]final cycle time         [scheduling]final cycle point
    [visualization]initial cycle time    [visualization]initial cycle point
    [visualization]final cycle time      [visualization]final cycle point

This change reflects the fact that cycling in cylc 6+ can now be over, for example, integers, instead of being purely date-time based.

Date-times written in initial cycle time and final cycle time were in a cylc-specific 10-digit (or shorter) CCYYMMDDhh format, such as 2014021400 for 00:00 on the 14th of February 2014.

Date-times are now required to be ISO 8601 compatible. This can be achieved easily enough by inserting a T between the day and the hour digits.

Table 5 Cycle Point Syntax Example

    Setting                          Pre-Cylc-6    Cylc-6+
    [scheduling]initial cycle time   2014021400    20140214T00

15.11.4. Cycling

Special start-up and cold-start tasks have been removed from cylc 6. Instead, use the initial/run-once notation as detailed in Initial Non-Repeating (R1) Tasks and Advanced Starting Up.

Repeating asynchronous tasks have also been removed because non date-time workflows can now be handled more easily with integer cycling. See for instance the satellite data processing example documented in Integer Cycling.

For repeating tasks with hour-based cycling the syntax has only minor changes:

Pre-cylc-6:

[scheduling]
    # ...
    [[dependencies]]
        [[[0,12]]]
            graph = foo[T-12] => foo & bar => baz

Cylc-6+:

[scheduling]
    # ...
    [[dependencies]]
        [[[T00,T12]]]
            graph = foo[-PT12H] => foo & bar => baz

Hour-based cycling section names are easy enough to convert, as seen in Table 6.

Table 6 Hourly Cycling Sections

    Pre-Cylc-6                              Cylc-6+
    [scheduling][[dependencies]][[[0]]]     [scheduling][[dependencies]][[[T00]]]
    [scheduling][[dependencies]][[[6]]]     [scheduling][[dependencies]][[[T06]]]
    [scheduling][[dependencies]][[[12]]]    [scheduling][[dependencies]][[[T12]]]
    [scheduling][[dependencies]][[[18]]]    [scheduling][[dependencies]][[[T18]]]

The graph text in hour-based cycling is also easy to convert, as seen in Table 7.

Table 7 Hourly Cycling Offsets

    Pre-Cylc-6      Cylc-6+
    my_task[T-6]    my_task[-PT6H]
    my_task[T-12]   my_task[-PT12H]
    my_task[T-24]   my_task[-PT24H] (or even my_task[-P1D])

15.11.5. No Implicit Creation of Tasks by Offset Triggers

Prior to cylc-6, inter-cycle offset triggers implicitly created task instances at the offset cycle points. For example, this pre-cylc-6 suite automatically creates instances of task foo at the offset hours 3,9,15,21 each day, for task bar to trigger off at 0,6,12,18:

# Pre cylc-6 implicit cycling.
[scheduling]
   initial cycle time = 2014080800
   [[dependencies]]
      [[[00,06,12,18]]]
         # This creates foo instances at 03,09,15,21:
         graph = foo[T-3] => bar

Here’s the direct translation to cylc-6+ format:

# In cylc-6+ this suite will stall.
[scheduling]
   initial cycle point = 20140808T00
   [[dependencies]]
      [[[T00,T06,T12,T18]]]
         # This does NOT create foo instances at 03,09,15,21:
         graph = foo[-PT3H] => bar

This suite fails validation with ERROR: No cycling sequences defined for foo, and at runtime it would stall with bar instances waiting on non-existent offset foo instances (note that these appear as ghost nodes in graph visualisations).

To fix this, explicitly define the cycling of foo on the offset sequence:

# Cylc-6+ requires explicit task instance creation.
[scheduling]
   initial cycle point = 20140808T00
   [[dependencies]]
      [[[T03,T09,T15,T21]]]
         graph = foo
      [[[T00,T06,T12,T18]]]
         graph = foo[-PT3H] => bar

Implicit task creation by offset triggers is no longer allowed because it is error prone: a mistaken task cycle point offset should cause a failure rather than automatically creating task instances on the wrong cycling sequence.

15.12. Known Issues

15.12.1. Current Known Issues

The best place to find current known issues is on GitHub.

15.12.2. Notable Known Issues

15.12.2.1. Use of pipes in job scripts

In bash, the return status of a pipeline is normally the exit status of its last command. This is unsafe, because if any other command in the pipeline fails the script will nevertheless continue.

For safety, a cylc task job script running in bash will have the set -o pipefail option turned on automatically. If a pipeline exists in a task’s script, etc section, the failure of any part of a pipeline will cause the command to return a non-zero code at the end, which will be reported as a task job failure. Due to the unique nature of a pipeline, the job file will trap the failure of the individual commands, as well as the whole pipeline, and will attempt to report a failure back to the suite twice. The second message is ignored by the suite, and so the behaviour can be safely ignored. (You should probably still investigate the failure, however!)

15.13. GNU GENERAL PUBLIC LICENSE v3.0

See the GNU GENERAL PUBLIC LICENSE v3.0.

16. Suite Design Guide

Cylc Rose Suite Design Best Practice Guide

Version 1.0 - 23 March 2017

Last updated for: Cylc-7.2.0 and Rose-2017.02.0

Hilary Oliver, Dave Matthews, Andy Clark, and Contributors


16.1. Introduction

This document provides guidance on making complex Cylc + Rose workflows that are clear, maintainable, and portable. Note that best practice advice may evolve over time with the capabilities of Rose and Cylc.

Content is drawn from the Rose and Cylc user guides, earlier Met Office suite design and operational suite review documents, experience with real suites across the Unified Model Consortium, and discussion among members of the UM TISD (Technical Infrastructure Suite Design) working group.

We start with the most general topics (coding style, general principles), move on to more advanced topics (efficiency and maintainability, portable suites), and end with some pointers to future developments.

Note

A good working knowledge of Cylc and Rose is assumed. For further details, please consult the Cylc and Rose user guides.

Note

For non-Rose users: this document comes out of the Unified Model Consortium wherein Cylc is used within the Rose suite management framework. However, the bulk of the information in this guide is about Cylc suite design; which parts are Rose-specific should be clear from context.

16.2. Style Guidelines

Coding style is largely subjective, but for collaborative development of complex systems it is important to settle on a clear and consistent style to avoid getting into a mess. The following style rules are recommended.

16.2.1. Tab Characters

Do not use tab characters. Tab width depends on editor settings, so a mixture of tabs and spaces in the same file can render as a mess.

Use grep -InPr "\t" * to find tabs recursively in files in a directory.

In vim, use :%retab to convert existing tabs to spaces, and :set expandtab to automatically convert new tabs.

In emacs use whitespace-cleanup.

In gedit, use the Draw Spaces plugin to display tabs and spaces.

16.2.2. Trailing Whitespace

Trailing whitespace is untidy, it makes quick reformatting of paragraphs difficult, and it can result in hard-to-find bugs (e.g. a space after an intended line continuation marker).

To remove existing trailing whitespace in a file use a sed or perl one-liner:

$ perl -pi -e "s/ +$//g" /path/to/file
# or:
$ sed --in-place 's/[[:space:]]\+$//' path/to/file

Or do a similar search-and-replace operation in your editor. Editors like vim and emacs can also be configured to highlight or automatically remove trailing whitespace on the fly.

16.2.3. Indentation

Consistent indentation makes a suite definition more readable, it shows section nesting clearly, and it makes block re-indentation operations easier in text editors. Indent suite.rc syntax four spaces per nesting level:

16.2.3.1. Config Items

[SECTION]
    # A comment.
    title = the quick brown fox
    [[SUBSECTION]]
        # Another comment.
        a short item = value1
        a very very long item = value2

Don’t align item = value pairs on the = character like this:

[SECTION]  # Avoid this.
             a short item = value1
    a very very long item = value2

or like this:

[SECTION]  # Avoid this.
    a short item          = value1
    a very very long item = value2

because the whole block may need re-indenting after a single change, which will pollute your revision history with spurious changes.

Comments should be indented to the same level as the section or item they refer to, and trailing comments should be preceded by two spaces, as shown above.

16.2.3.2. Script String Lines

Script strings are written verbatim to task job scripts so they should really be indented from the left margin:

[runtime]
    [[foo]]
        # Recommended.
        post-script = """
if [[ $RESULT == "bad" ]]; then
    echo Goodbye World!
    exit 1
fi"""

Indentation is mostly ignored by the bash interpreter, but is useful for readability. It is mostly harmless to indent internal script lines as if part of the Cylc syntax, or even out to the triple quotes:

[runtime]
    [[foo]]
        # OK, but...
        post-script = """
            if [[ $RESULT == "bad" ]]; then
                echo Goodbye World!
                exit 1
            fi"""

On parsing the triple-quoted value, Cylc will remove any common leading whitespace from each line using the logic of Python's textwrap.dedent, so the script block would end up the same as in the previous example. However, you should watch your line length (see Line Length And Continuation) when you have many levels of indentation.

Note

Take care when indenting here documents:

[runtime]
    [[foo]]
     script = """
     cat >> log.txt <<_EOF_
         The quick brown fox jumped
         over the lazy dog.
     _EOF_
              """

In the above, each line in log.txt would end up with four leading spaces. The following gives lines with no leading whitespace.

[runtime]
    [[foo]]
        script = """
        cat >> log.txt <<_EOF_
        The quick brown fox jumped
        over the lazy dog.
        _EOF_
                 """

16.2.3.3. Graph String Lines

Multiline graph strings can be entirely free-form:

[scheduling]
    [[dependencies]]
        graph = """
    # Main workflow:
  FAMILY:succeed-all => bar & baz => qux

    # Housekeeping:
  qux => rose_arch => rose_prune"""

Whitespace is ignored in graph string parsing, however, so internal graph lines can be indented as if part of the suite.rc syntax, or even out to the triple quotes, if you feel it aids readability (but watch line length with large indents; see Line Length And Continuation):

[scheduling]
    [[dependencies]]
        graph = """
            # Main workflow:
            FAMILY:succeed-all => bar & baz => qux

            # Housekeeping:
            qux => rose_arch => rose_prune"""

Both styles are acceptable; choose one and use it consistently.

16.2.3.4. Jinja2 Code

A suite.rc file with embedded Jinja2 code is essentially a Jinja2 program to generate a Cylc suite definition. It is not possible to consistently indent the Jinja2 as if it were part of the suite.rc syntax (which to the Jinja2 processor is just arbitrary text), so it should be indented from the left margin on its own terms:

[runtime]
    [[OPS]]
{% for T in OPS_TASKS %}
    {% for M in range(M_MAX) %}
    [[ops_{{T}}_{{M}}]]
        inherit = OPS
    {% endfor %}
{% endfor %}

16.2.4. Comments

Comments should be minimal, but not too minimal. If context and clear task and variable names will do, leave it at that. Extremely verbose comments tend to get out of sync with the code they describe, which can be worse than having no comments.

Avoid long lists of numbered comments - future changes may require mass renumbering.

Avoid page-width “section divider” comments, especially if they are not strictly limited to the standard line length (see Line Length And Continuation).

Indent comments to the same level as the config items they describe.

16.2.5. Titles, Descriptions, And URLs

Document the suite and its tasks with title, description, and url items instead of comments. These can be displayed, or linked to, by the GUI at runtime.
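
For example, a minimal sketch (the task name, values, and URL are invented):

[runtime]
    [[obs_process]]
        title = process incoming observations
        description = """
            Quality-control incoming observation files and convert
            them to the format expected by the model."""
        URL = http://my-site.example/docs/obs-processing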

16.2.6. Line Length And Continuation

Keep to the standard maximum line length of 79 characters where possible. Very long lines affect readability and make side-by-side diffs hard to view.

Backslash line continuation markers can be used anywhere in the suite.rc file but should be avoided if possible because they are easily broken by invisible trailing whitespace.

Continuation markers are not needed in graph strings where trailing trigger arrows imply line continuation:

[scheduling]
    [[dependencies]]
        # No line continuation marker is needed here.
        graph = """prep => one => two => three =>
                four => five six => seven => eight"""
[runtime]
    [[MY_TASKS]]
    # A line continuation marker *is* needed here:
    [[one, two, three, four, five, six, seven, eight, nine, ten, \
      eleven, twelve, thirteen]]
        inherit = MY_TASKS

16.2.7. Task Naming Conventions

Use UPPERCASE for family names and lowercase for tasks, so you can distinguish them at a glance.

Choose a convention for multi-component names and use it consistently. Put the most general name components first for natural grouping in the GUI, e.g. obs_sonde, obs_radar (not sonde_obs etc.)

Within your convention keep names as short as possible.

16.2.7.1. UM System Task Names

For UM System suites we recommend the following full task naming convention:

model_system_function[_member]

For example, glu_ops_process_scatwind where glu refers to the global (deterministic model) update run, ops is the system that owns the task, and process_scatwind is the function it performs. The optional member suffix is intended for use with ensembles as needed.

Within this convention keep names as short as possible, e.g. use fcst instead of forecast.

UM forecast apps should be given names that reflect their general science configuration rather than geographic domain, to allow use on other model domains without causing confusion.

16.2.8. Rose Config Files

Use rose config-dump to load and re-save new Rose .conf files. This puts the files in a standard format (ordering of lines etc.) to ensure that spurious changes aren’t generated when you next use rose edit.

See also Optional App Config Files.

16.3. Basic Principles

This section covers general principles that should be kept in mind when writing any suite. More advanced topics are covered later: Efficiency And Maintainability and Portable Suites.

16.3.1. UTC Mode

Cylc has full timezone support if needed, but real time NWP suites should use UTC mode to avoid problems at the transition between local standard time and daylight saving time, and to enable the same suite to run the same way in different timezones.

[cylc]
    UTC mode = True

16.3.2. Fine Or Coarse-Grained Suites

Suites can have many small simple tasks, fewer large complex tasks, or anything in between. A task that runs many distinct processes can be split into many distinct tasks. The fine-grained approach is more transparent and it allows more task level concurrency and quicker failure recovery - you can rerun just what failed without repeating anything unnecessarily.

16.3.2.1. rose bunch

One caveat to our fine-graining advice is that submitting a large number of small tasks at once may be a problem on some platforms. If you have many similar concurrent jobs you can use rose bunch to pack them into a single task with incremental rerun capability: retriggering the task will rerun just the component jobs that did not successfully complete earlier.
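
As a sketch (the [bunch] option names here are recalled from the rose_bunch
documentation, and the command and argument values are invented; check the
rose_bunch reference before use):

# app/process-obs/rose-app.conf
meta=rose_bunch

[bunch]
# One invocation of the command per argument value below:
command-format=process-obs.sh %(type)s
# Limit the number of concurrent invocations:
pool-size=4

[bunch-args]
type=ship buoy radiosonde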

16.3.3. Monolithic Or Interdependent Suites

When writing suites from scratch you may need to decide between putting multiple loosely connected sub-workflows into a single large suite, or constructing a more modular system of smaller suites that depend on each other through inter-suite triggering. Each approach has its pros and cons, depending on your requirements and preferences with respect to the complexity and manageability of the resulting system.

The cylc gscan GUI lets you monitor multiple suites at a time, and you can define virtual groups of suites that collapse into a single state summary.

16.3.3.1. Inter-Suite Triggering

A task in one suite can explicitly trigger off of a task in another suite. The full range of possible triggering conditions is supported, including custom message triggers. Remote triggering involves repeatedly querying (“polling”) the remote suite run database, not the suite server program, so it works even if the other suite is down at the time.

There is special graph syntax to support triggering off of a task in another suite, or you can call the underlying cylc suite-state command directly in task scripting.
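
A sketch of the graph syntax (the suite and task names are invented; check the
Cylc user guide section on suite state polling for the precise form):

[scheduling]
    [[dependencies]]
        [[[T00]]]
            # "wait_fc" polls suite "up.stream" until its task
            # "forecast" succeeds at the same cycle point:
            graph = wait_fc<up.stream::forecast> => local_proc

In task scripting the equivalent check might be cylc suite-state up.stream --task=forecast --point=$CYLC_TASK_CYCLE_POINT --status=succeeded (again, check cylc suite-state --help for the exact options).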

In real time suites you may want to use clock-triggers to delay the onset of inter-suite polling until roughly the expected completion time of the remote task.

16.3.4. Self-Contained Suites

All files generated by Cylc during a suite run are confined to the suite run directory $HOME/cylc-run/<SUITE>. However, Cylc has no control over the locations of the programs, scripts, and files that are executed, read, or generated by your tasks at runtime. It is up to you to ensure that all of this is confined to the suite run directory too, as far as possible.

Self-contained suites are more robust, easier to work with, and more portable. Multiple instances of the same suite (with different suite names) should be able to run concurrently under the same user account without mutual interference.

16.3.4.1. Avoiding External Files

Suites that use external scripts, executables, and files beyond the essential system libraries and utilities are vulnerable to external changes: someone else might interfere with these files without telling you.

In some cases you may need to symlink to large external files anyway, if space or copy speed is a problem, but otherwise suites with private copies of all the files they need are more robust.

16.3.4.2. Installing Files At Start-up

Use rose suite-run file creation mode or R1 install tasks to copy files to the self-contained suite run directory at start-up. Install tasks are preferred for time-consuming installations because they don’t slow the suite start-up process, they can be monitored in the GUI, they can run directly on target platforms, and you can rerun them later without restarting the suite. If you are using symbolic links to install files under your suite directory it is recommended that the linking should be set up to fail if the source is missing e.g. by using mode=symlink+ for file installation in a rose app.

16.3.4.3. Confining Output To The Run Directory

Output files should be confined to the suite run directory tree. Then all output is easy to find, multiple instances of the same suite can run concurrently without interference, and other users should be able to copy and run your suite with few modifications. Cylc provides a share directory for generated files that are used by several tasks in a suite (see Shared Task IO Paths). Archiving tasks can use rose arch to copy or move selected files to external locations as needed (see Suite Housekeeping).

16.3.5. Task Host Selection

At sites with multiple task hosts to choose from, use rose host-select to dynamically select appropriate task hosts rather than hard coding particular hostnames. This enables your suite to adapt to particular machines being down or heavily overloaded by selecting from a group of hosts based on a series of criteria. rose host-select will only return hosts that can be contacted by non-interactive SSH.
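
For example (the host-group name hpc is an assumption; groups are defined in
the Rose site/user configuration):

[runtime]
    [[HPC]]
        [[[remote]]]
            # Choose a host from the "hpc" group at job submission time:
            host = $(rose host-select hpc)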

16.3.6. Task Scripting

Non-trivial task scripting should be held in external files rather than inlined in the suite.rc. This keeps the suite definition tidy, and it allows proper shell-mode text editing and independent testing of task scripts.

For automatic access by task jobs, task-specific scripts should be kept in Rose app bin directories, and shared scripts kept in (or installed to) the suite bin directory.

16.3.6.1. Coding Standards

When writing your own task scripts, make consistent use of appropriate coding standards for the language concerned.

16.3.6.2. Basic Functionality

In consideration of future users who may not be expert on the internals of your suite and its tasks, all task scripts should:

  • Print clear usage information if invoked incorrectly (and via the standard options -h, --help).
  • Print useful diagnostic messages in case of error. For example, if a file was not found, the error message should contain the full path to the expected location.
  • Always return correct shell exit status - zero for success, non-zero for failure. This is used by Cylc job wrapper code to detect success and failure and report it back to the suite server program.
  • In shell scripts use set -u to abort on any reference to an undefined variable. If you really need an undefined variable to evaluate to an empty string, make it explicit: FOO=${FOO:-}.
  • In shell scripts use set -e to abort on any error without having to failure-check each command explicitly.
  • In shell scripts use set -o pipefail to abort on any error within a pipeline. Note that all commands in the pipeline will still run; it will just exit with the rightmost non-zero exit status (see the sketch below).

Note

Examples and more details are available for the above three set commands.
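
Putting the three options together gives a defensive task-script header; a
minimal sketch (the variable and file names are invented):

#!/usr/bin/env bash
set -eu
set -o pipefail

# set -u: this line aborts the script if INPUT_FILE is unset.
echo "Processing ${INPUT_FILE}"

# set -e / set -o pipefail: the script aborts if sed or sort fails
# anywhere in the pipeline, not just in the last command.
sed 's/#.*//' "${INPUT_FILE}" | sort > sorted.txt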

16.3.7. Rose Apps

Rose apps allow all non-shared task configuration - which is not relevant to workflow automation - to be moved from the suite definition into app config files. This makes suites tidier and easier to understand, and it allows rose edit to provide a unified metadata-enhanced view of the suite and its apps (see Rose Metadata Compliance).

Rose apps are a clear winner for tasks with complex configuration requirements. It matters less for those with little configuration, but for consistency and to take full advantage of rose edit it makes sense to use Rose apps for most tasks.

When most tasks are Rose apps, set the app-run command as a root-level default, and override it for the occasional non Rose app task:

[runtime]
    [[root]]
        script = rose task-run -v
    [[rose-app1]]
        #...
    [[rose-app2]]
        #...
    [[hello-world]]  # Not a Rose app.
        script = echo "Hello World"

16.3.8. Rose Metadata Compliance

Rose metadata drives page layout and sort order in rose edit, plus help information, input validity checking, macros for advanced checking and app version upgrades, and more.

To ensure the suite and its constituent applications are being run as intended it should be valid against any provided metadata: launch the rose edit GUI or run rose macro --validate on the command line to highlight any errors, and correct them prior to use. If errors are flagged incorrectly you should endeavour to fix the metadata.

When writing a new suite or application, consider creating metadata to facilitate ease of use by others.

16.3.9. Task Independence

Essential dependencies must be encoded in the suite graph, but tasks should not rely unnecessarily on the action of other tasks. For example, tasks should create their own output directories if they don't already exist, even if they would normally be created by an earlier task in the workflow. This makes it easier to run tasks alone during development and testing.

16.3.10. Clock-Triggered Tasks

Tasks that wait on real time data should use clock-triggers to delay job submission until the expected data arrival time:

[scheduling]
    initial cycle point = now
    [[special tasks]]
        # Trigger 5 min after wall-clock time is equal to cycle point.
        clock-trigger = get-data(PT5M)
    [[dependencies]]
        [[[T00]]]
            graph = get-data => process-data

Clock-triggered tasks typically have to handle late data arrival. Task execution retry delays can be used simply to retrigger the task at intervals until the data is found, but frequent retries of small tasks probably should not go through a batch scheduler, and multiple task failures will be logged for what is essentially a normal condition (at least it is normal until the data is really late).

Rather than using task execution retry delays to repeatedly trigger a task that checks for a file, it may be better to have the task itself repeatedly poll for the data (see Rose App File Polling for example).

16.3.11. Rose App File Polling

Rose apps have built-in polling functionality to check repeatedly for the existence of files before executing the main app. See the [poll] section in Rose app config documentation. This is a good way to implement check-and-wait functionality in clock-triggered tasks (Clock-Triggered Tasks), for example.

It is important to note that frequent polling may be bad for some filesystems, so be sure to configure a reasonable interval between polls.
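
A sketch of the check-and-wait configuration (the [poll] option names and the
delay syntax are assumptions based on the Rose documentation, and the file
path is invented):

# rose-app.conf
[poll]
# Try up to 10 times at 1 minute intervals:
delays=10*PT1M
all-files=${DATA_DIR}/obs_input.nc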

16.3.12. Task Execution Time Limits

Instead of setting job wall clock limits directly in batch scheduler directives, use the execution time limit suite config item. Cylc automatically derives the correct batch scheduler directives from it, and it is also used to run background and at jobs via the timeout command, and to poll tasks that have not reported as finished by the configured time limit.
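
For example, a minimal sketch:

[runtime]
    [[model]]
        [[[job]]]
            execution time limit = PT30M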

16.3.13. Restricting Suite Activity

It may be possible for large suites to overwhelm a job host by submitting too many jobs at once:

  • Large suites that are not sufficiently limited by real time clock triggering or inter-cycle dependence may generate a lot of runahead (this refers to Cylc’s ability to run multiple cycles at once, restricted only by the dependencies of individual tasks).
  • Some suites may have large families of tasks whose members all become ready at the same time.

These problems can be avoided with runahead limiting and internal queues, respectively.

16.3.13.1. Runahead Limiting

By default Cylc allows a maximum of three cycle points to be active at the same time, but this value is configurable:

[scheduling]
    initial cycle point = 2020-01-01T00
    # Don't allow any cycle interleaving:
    max active cycle points = 1

16.3.13.2. Internal Queues

Tasks can be assigned to named internal queues that limit the number of members that can be active (i.e. submitted or running) at the same time:

[scheduling]
    initial cycle point = 2020-01-01T00
    [[queues]]
        # Allow only 2 members of BIG_JOBS to run at once:
        [[[big_jobs_queue]]]
            limit = 2
            members = BIG_JOBS
    [[dependencies]]
        [[[T00]]]
            graph = pre => BIG_JOBS
[runtime]
    [[BIG_JOBS]]
    [[foo, bar, baz, ...]]
        inherit = BIG_JOBS

16.3.14. Suite Housekeeping

Ongoing cycling suites can generate an enormous number of output files and logs so regular housekeeping is very important. Special housekeeping tasks, typically the last tasks in each cycle, should be included to archive selected important files and then delete everything at some offset from the current cycle point.

The Rose built-in apps rose_arch and rose_prune provide an easy way to do this. They can be configured easily with file-matching patterns and cycle point offsets to perform various housekeeping operations on matched files.
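
A sketch of a rose_prune housekeeping app (the [prune] option names are
recalled from the Rose documentation and the offsets are invented; check the
rose_prune reference before use):

# app/housekeep/rose-app.conf
meta=rose_prune

[prune]
# Remove task work directories more than one day behind the cycle:
prune-work-at=-P1D
# Remove share/cycle data directories more than three days behind:
prune-datac-at=-P3D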

16.3.15. Complex Jinja2 Code

The Jinja2 template processor provides general programming constructs, extensible with custom Python filters, that can be used to generate the suite definition. This makes it possible to write flexible multi-use suites with structure and content that varies according to various input switches. There is a cost to this flexibility, however: excessive use of Jinja2 can make a suite hard to understand and maintain. It is difficult to say exactly where to draw the line, but we recommend erring on the side of simplicity and clarity: write suites that are easy to understand and therefore easy to modify for other purposes, rather than extremely complicated suites that attempt to do everything out of the box but are hard to maintain and modify.

Note that use of Jinja2 loops for generating tasks is now deprecated in favour of built-in parameterized tasks - see Parameterized Tasks.

16.3.16. Shared Configuration

Configuration that is common to multiple tasks should be defined in one place and used by all, rather than duplicated in each task. Duplication is a maintenance risk because changes have to be made consistently in several places at once.

16.3.16.1. Jinja2 Variables

In simple cases you can share by passing a Jinja2 variable to all the tasks that need it:

{% set JOB_VERSION = 'A23' %}
[runtime]
    [[foo]]
        script = run-foo --version={{JOB_VERSION}}
    [[bar]]
        script = run-bar --version={{JOB_VERSION}}

16.3.16.2. Inheritance

Sharing by inheritance of task families is recommended when more than a few configuration items are involved.

The simplest application of inheritance is to set global defaults in the [runtime][[root]] namespace, which is inherited by all tasks. However, this should only be done for settings that really are used by the vast majority of tasks. Over-sharing via root, particularly of environment variables, is a maintenance risk because it can be very difficult to be sure which tasks are using which global variables.

Any [runtime] settings can be shared - scripting, host and batch scheduler configuration, environment variables, and so on - from single items up to complete task or app configurations. At the latter extreme, it is quite common to have several tasks that inherit the same complete job configuration followed by minor task-specific additions:

[runtime]
    [[FILE-CONVERT]]
        script = convert-netcdf
        #...
    [[convert-a]]
        inherit = FILE-CONVERT
        [[[environment]]]
              FILE_IN = file-a
    [[convert-b]]
        inherit = FILE-CONVERT
        [[[environment]]]
              FILE_IN = file-b

Inheritance is covered in more detail from an efficiency perspective in The Task Family Hierarchy.

16.3.16.3. Shared Task IO Paths

If one task uses files generated by another task (and both see the same filesystem) a common IO path should normally be passed to both tasks via a shared environment variable. As far as Cylc is concerned this is no different to other shared configuration items, but there are some additional aspects of usage worth addressing here.

Primarily, for self-containment (see Self-Contained Suites) shared IO paths should be under the suite share directory, the location of which is passed to all tasks as $CYLC_SUITE_SHARE_PATH.

The rose task-env utility can provide additional environment variables that refer to static and cyclepoint-specific locations under the suite share directory.

[runtime]
    [[my-task]]
        env-script = eval $(rose task-env -T P1D -T P2D)

For a current cycle point of 20170105 this will make the following variables available to tasks:

ROSE_DATA=$CYLC_SUITE_SHARE_PATH/data
ROSE_DATAC=$CYLC_SUITE_SHARE_PATH/cycle/20170105
ROSE_DATACP1D=$CYLC_SUITE_SHARE_PATH/cycle/20170104
ROSE_DATACP2D=$CYLC_SUITE_SHARE_PATH/cycle/20170103

Subdirectories of $ROSE_DATAC etc. should be agreed between different sub-systems of the suite; typically they are named for the file-generating tasks, and the file-consuming tasks should know to look there.

The share-not-duplicate rule can be relaxed for shared files whose names are agreed by convention, so long as their locations under the share directory are proper shared suite variables. For instance the Unified Model uses a large number of files whose conventional names (glu_snow, for example) can reasonably be expected not to change, so they are typically hardwired into app configurations (as $ROSE_DATA/glu_snow, for example) to avoid cluttering the suite definition.

Here two tasks share a workspace under the suite share directory by inheritance:

# Sharing an I/O location via inheritance.
[scheduling]
    [[dependencies]]
        graph = write_data => read_data
[runtime]
    [[root]]
        env-script = eval $(rose task-env)
    [[WORKSPACE]]
        [[[environment]]]
            DATA_DIR = ${ROSE_DATA}/png
    [[write_data]]
        inherit = WORKSPACE
        script = """
mkdir -p $DATA_DIR
write-data.exe -o ${DATA_DIR}"""
    [[read_data]]
        inherit = WORKSPACE
        script = read-data.exe -i ${DATA_DIR}

In simple cases, where an appropriate family does not already exist, paths can be shared via Jinja2 variables:

# Sharing an I/O location with Jinja2.
{% set DATA_DIR = '$ROSE_DATA/stuff' %}
[scheduling]
    [[dependencies]]
        graph = write_data => read_data
[runtime]
    [[write_data]]
        script = """
mkdir -p {{DATA_DIR}}
write-data.exe -o {{DATA_DIR}}"""
    [[read_data]]
        script = read-data.exe -i {{DATA_DIR}}

For completeness we note that it is also possible to configure multiple tasks to use the same work directory, so that they can all share files in $PWD. (Cylc executes task jobs in work directories that are, by default, unique to each task.) This may simplify the suite slightly, and it may be useful if you are unfortunate enough to have executables that are designed for IO in $PWD, but it is not recommended: there is a higher risk of interference between tasks; it will break rose task-run incremental file creation mode; and rose task-run --new will in effect delete the work directories of tasks other than its intended target.

# Shared work directory: tasks can read and write in $PWD - use with caution!
[scheduling]
    initial cycle point = 2018
    [[dependencies]]
        [[[P1Y]]]
            graph = write_data => read_data
[runtime]
    [[WORKSPACE]]
        work sub-directory = $CYLC_TASK_CYCLE_POINT/datadir
    [[write_data]]
        inherit = WORKSPACE
        script = write-data.exe
    [[read_data]]
        inherit = WORKSPACE
        script = read-data.exe

16.3.16.4. Varying Behaviour By Cycle Point

To make a cycling job behave differently at different cycle points you could use a single task with scripting that reacts to the cycle point it finds itself running at, but it is better to use different tasks (in different cycling sections) that inherit the same base job configuration. This results in a more transparent suite that can be understood just by inspecting the graph:

# Run the same job differently at different cycle points.
[scheduling]
    initial cycle point = 2020-01-01T00
    [[dependencies]]
        [[[T00]]]
            graph = pre => long_fc => post
        [[[T12]]]
            graph = pre => short_fc => post
[runtime]
    [[MODEL]]
        script = run-model.sh
    [[long_fc]]
        inherit = MODEL
        [[[job]]]
            execution time limit = PT30M
        [[[environment]]]
            RUN_LEN = PT48H
    [[short_fc]]
        inherit = MODEL
        [[[job]]]
            execution time limit = PT10M
        [[[environment]]]
            RUN_LEN = PT12H

The few differences between short_fc and long_fc, including batch scheduler resource requests, can be configured after common settings are inherited.

16.3.16.5. At Start-Up

Similarly, if a cycling job needs special behaviour at the initial (or any other) cycle point, just use a different logical task in an R1 graph and have it inherit the same job configuration as the general cycling task, rather than using a single task with scripting that behaves differently if it finds itself running at the initial cycle point. A minimal sketch follows.
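
Here a cold-start variant of the model runs only at the initial cycle point
(the task names and recurrences are invented):

[scheduling]
    initial cycle point = 2021-01-01T00
    [[dependencies]]
        [[[R1]]]          # Initial cycle point only.
            graph = model_cold => post
        [[[+P1D/P1D]]]    # Every subsequent day.
            graph = model => post
[runtime]
    [[MODEL]]
        script = run-model.sh
    [[model]]
        inherit = MODEL
    [[model_cold]]
        inherit = MODEL
        [[[environment]]]
            COLD_START = true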

16.3.17. Automating Failure Recovery

16.3.17.1. Job Submission Retries

When submitting jobs to a remote host, use job submission retries to automatically resubmit tasks in the event of network outages. Note this is distinct from job retries for job execution failure (just below).

Job submission retries should normally be host (or host-group for rose host-select) specific, not task-specific, so configure them in a host (or host-group) specific family. The following suite.rc fragment configures all HPC jobs to retry on job submission failure up to 10 times at 1 minute intervals, then another 5 times at 1 hour intervals:

[runtime]
    [[HPC]]  # Inherited by all jobs submitted to HPC.
        [[[job]]]
            submission retry delays = 10*PT1M, 5*PT1H

16.3.17.2. Job Execution Retries

Automatic retry on job execution failure is useful if you have good reason to believe that a simple retry will usually succeed. This may be the case if the job host is known to be flaky, or if the job only ever fails for one known reason that can be fixed on a retry. For example, if a model fails occasionally with a numerical instability that can be remedied with a short timestep rerun, then an automatic retry may be appropriate:

[runtime]
    [[model]]
        script = """
if [[ $CYLC_TASK_TRY_NUMBER > 1 ]]; then
    SHORT_TIMESTEP=true
else
    SHORT_TIMESTEP=false
fi
model.exe"""
        [[[job]]]
            execution retry delays = 1*PT0M

16.3.17.3. Failure Recovery Workflows

For recovery from failures that require explicit diagnosis you can configure alternate routes through the workflow, together with suicide triggers that remove the unused route. In the following example, if the model fails a diagnosis task will trigger; if it determines the cause of the failure is a known numerical instability (e.g. by parsing model job logs) it will succeed, triggering a short timestep run. Postprocessing can proceed from either the original or the short-step model run, and suicide triggers remove the unused path from the workflow:

_images/failure-recovery.png
[scheduling]
    [[dependencies]]
        graph = """
            model | model_short => postproc
            model:fail => diagnose => model_short
              # Clean up with suicide triggers:
            model => ! diagnose & ! model_short
            model_short => ! model"""

16.3.18. Include Files

Include-files should not be overused, but they can sometimes be useful (e.g. see Portable Suites):

#...
{% include 'inc/foo.rc' %}

(Technically this inserts a Jinja2-rendered file template). Cylc also has a native include mechanism that pre-dates Jinja2 support and literally inlines the include-file:

#...
%include 'inc/foo.rc'

The two methods normally produce the same result, but use the Jinja2 version if you need to construct an include-file name from a variable (because Cylc include-files get inlined before Jinja2 processing is done):

#...
{% include 'inc/' ~ SITE ~ '.rc' %}

16.4. Efficiency And Maintainability

Efficiency (in the sense of economy of suite definition) and maintainability go hand in hand. This section describes techniques for clean and efficient construction of complex workflows that are easy to understand, maintain, and modify.

16.4.1. The Task Family Hierarchy

A properly designed family hierarchy fulfils three purposes in Cylc:

  • efficient sharing of all configuration common to groups of related tasks
  • efficient bulk triggering, for clear scheduling graphs
  • clean suite visualization and monitoring, because families are collapsible in the GUIs

16.4.1.1. Sharing By Inheritance

Duplication is a maintenance risk because changes have to be repeated in multiple places without mistakes. On the other hand, unnecessary sharing of items via global variables is also bad because it is hard to be sure which tasks are using which variables. A properly designed runtime inheritance hierarchy can give every task exactly what it needs, and nothing that it doesn’t need.

If a group of related tasks has some configuration in common, it can be factored out into a task family inherited by all.

[runtime]
    [[OBSPROC]]
        # Settings common to all obs processing tasks.
    [[obs1]]
        inherit = OBSPROC
    [[obs2]]
        inherit = OBSPROC

If several families have settings in common, they can in turn can inherit from higher-level families.

Multiple inheritance allows efficient sharing even for overlapping categories of tasks. For example consider that some obs processing tasks in the following suite run parallel jobs and some serial:

[runtime]
    [[SERIAL]]
        # Serial job settings.
    [[PARALLEL]]
        # Parallel job settings.
    [[OBSPROC]]
        # Settings for all obs processing tasks.
    [[obs1, obs2, obs3]]
        # Serial obs processing tasks.
        inherit = OBSPROC, SERIAL
    [[obs4, obs5]]
        # Parallel obs processing tasks.
        inherit = OBSPROC, PARALLEL

Note that suite parameters should really be used to define family members efficiently - see Generating Tasks Automatically.

Cylc provides tools to help make sense of your inheritance hierarchy:

  • cylc graph -n/--namespaces - plot the full multiple inheritance graph (not the dependency graph)
  • cylc get-config SUITE - print selected sections or items after inheritance processing
  • cylc graph SUITE - plot the dependency graph, with collapsible first-parent families (see Task Families And Visualization)
  • cylc list -t/--tree SUITE - print the first-parent inheritance hierarchy
  • cylc list -m/--mro SUITE - print the inheritance precedence order for each runtime namespace

16.4.1.2. Family Triggering

Task families can be used to simplify the scheduling graph wherever many tasks need to trigger at once:

[scheduling]
    [[dependencies]]
        graph = pre => MODELS
[runtime]
    [[MODELS]]
    [[model1, model2, model3, ...]]
        inherit = MODELS

To trigger off of many tasks at once, family names need to be qualified by <state>-all or <state>-any to indicate the desired member-triggering semantics:

[scheduling]
    [[dependencies]]
        graph = """pre => MODELS
                MODELS:succeed-all => post"""

Note that this can be simplified further because Cylc ignores trigger qualifiers like :succeed-all on the right of trigger arrows to allow chaining of dependencies:

[scheduling]
    [[dependencies]]
        graph = pre => MODELS:succeed-all => post

16.4.1.3. Family-to-Family Triggering

[scheduling]
    [[dependencies]]
        graph = BIG_FAM_1:succeed-all => BIG_FAM_2

This means every member of BIG_FAM_2 depends on every member of BIG_FAM_1 succeeding. For very large families this can create so many dependencies that it affects the performance of Cylc at run time, as well as cluttering graph visualizations with unnecessary edges. Instead, interpose a dummy task that signifies completion of the first family:

[scheduling]
    [[dependencies]]
        graph = BIG_FAM_1:succeed-all => big_fam_1_done => BIG_FAM_2

For families with M and N members respectively, this reduces the number of dependencies from M*N to M+N without affecting the scheduling.

_images/fam-to-fam-1.png _images/fam-to-fam-2.png

16.4.1.4. Task Families And Visualization

First parents in the inheritance hierarchy double as collapsible summary groups for visualization and monitoring. Tasks should generally be grouped into visualization families that reflect their logical purpose in the suite rather than technical detail such as inherited job submission or host settings. So in the example under Sharing By Inheritance above all obs<n> tasks collapse into OBSPROC but not into SERIAL or PARALLEL.

If necessary you can introduce new namespaces just for visualization:

[runtime]
    [[MODEL]]
        # (No settings here - just for visualization).
    [[model1, model2]]
        inherit = MODEL, HOSTX
    [[model3, model4]]
        inherit = MODEL, HOSTY

To stop a solo parent being used in visualization, demote it to secondary with a null parent like this:

[runtime]
    [[SERIAL]]
    [[foo]]
        # Inherit settings from SERIAL but don't use it in visualization.
        inherit = None, SERIAL

16.4.2. Generating Tasks Automatically

Groups of tasks that are closely related such as an ensemble of model runs or a family of obs processing tasks, or sections of workflow that are repeated with minor variations, can be generated automatically by iterating over some integer range (e.g. model<n> for n = 1..10) or list of strings (e.g. obs<type> for type = ship, buoy, radiosonde, ...).

16.4.2.1. Jinja2 Loops

Task generation was traditionally done in Cylc with explicit Jinja2 loops, like this:

# Task generation the old way: Jinja2 loops (NO LONGER RECOMMENDED!)
{% set PARAMS = range(1,11) %}
[scheduling]
    [[dependencies]]
        graph = """
{% for P in PARAMS %}
      pre => model_p{{P}} => post
      {% if P == 5 %}
          model_p{{P}} => check
      {% endif %}
{% endfor %}    """
[runtime]
{% for P in PARAMS %}
    [[model_p{{P}}]]
        script = echo "my parameter value is {{P}}"
    {% if P == 1 %}
        # special case...
    {% endif %}
{% endfor %}

Unfortunately this makes a mess of the suite definition, particularly the scheduling graph, and it gets worse with nested loops over multiple parameters.

_images/param-1.png

16.4.2.2. Parameterized Tasks

Cylc-6.11 introduced built-in suite parameters for generating tasks without destroying the clarity of the base suite definition. Here’s the same example using suite parameters instead of Jinja2 loops:

# Task generation the new way: suite parameters.
[cylc]
    [[parameters]]
        p = 1..10
[scheduling]
    [[dependencies]]
        graph = """pre => model<p> => post
                model<p=5> => check"""
[runtime]
    [[model<p>]]
        script = echo "my parameter value is ${CYLC_TASK_PARAM_p}"
    [[model<p=7>]]
        # special case ...

Here model<p> expands to model_p7 for p=7, and so on, via the default expansion template for integer-valued parameters, but custom templates can be defined if necessary. Parameters can also be defined as lists of strings, and you can define dependencies between different values: chunk<p-1> => chunk<p>. Here’s a multi-parameter example:

[cylc]
    [[parameters]]
        run = a, b, c
        m = 1..5
[scheduling]
    [[dependencies]]
        graph = pre => init<run> => sim<run,m> => close<run> => post
[runtime]
    [[sim<run,m>]]

_images/param-2.png
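
Parameters can also take string values, as noted above; a minimal sketch (the
task names are invented):

[cylc]
    [[parameters]]
        obs = ship, buoy, radiosonde
[scheduling]
    [[dependencies]]
        graph = prep => proc<obs> => collate
[runtime]
    [[proc<obs>]]
        # Each member (proc_ship, proc_buoy, proc_radiosonde)
        # sees its own parameter value:
        script = process-obs ${CYLC_TASK_PARAM_obs}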

If family members are defined by suite parameters, then parameterized trigger expressions are equivalent to family :<state>-all triggers. For example, this:

[cylc]
    [[parameters]]
        n = 1..5
[scheduling]
    [[dependencies]]
        graph = pre => model<n> => post
[runtime]
    [[MODELS]]
    [[model<n>]]
        inherit = MODELS

is equivalent to this:

[cylc]
    [[parameters]]
        n = 1..5
[scheduling]
    [[dependencies]]
        graph = pre => MODELS:succeed-all => post
[runtime]
    [[MODELS]]
    [[model<n>]]
        inherit = MODELS

(but future plans for family triggering may make the second case more efficient for very large families).

For more information on parameterized tasks see the Cylc user guide.

16.4.3. Optional App Config Files

Closely related tasks with few configuration differences between them - such as multiple UM forecast and reconfiguration apps in the same suite - should use the same Rose app configuration with the differences supplied by optional configs, rather than duplicating the entire app for each task.

Optional app configs should be valid on top of the main app config and not dependent on the use of other optional app configs. This ensures they will work correctly with macros and can therefore be upgraded automatically.

Note

Currently optional configs don’t work very well with UM STASH configuration - see UM STASH in Optional App Configs.

Optional app configs can be loaded by command line switch:

rose task-run -O key1 -O key2

or by environment variable:

ROSE_APP_OPT_CONF_KEYS = key1 key2

The environment variable is generally preferred in suites because you don’t have to repeat and override the root-level script configuration:

[runtime]
    [[root]]
        script = rose task-run -v
    [[foo]]
        [[[environment]]]
            ROSE_APP_OPT_CONF_KEYS = key1 key2

16.5. Portable Suites

A portable or interoperable suite can run “out of the box” at different sites, or in different environments such as research and operations within a site. For convenience we just use the term site portability.

Lack of portability is a major barrier to collaborative development when sites need to run more or less the same workflow, because it is very difficult to translate changes manually between large, complicated suites.

Most suites are riddled with site-specific details such as local build configurations, file paths, host names, and batch scheduler directives; but it is possible to cleanly factor all this out to make a portable suite. Significant variations in workflow structure can also be accommodated quite easily. If the site workflows are too different, however, you may decide that it is more appropriate for each site to maintain separate suites.

The recommended way to do this, which we expand on below, is:

  • Put all site-specific settings in include-files loaded at the end of a generic “core” suite definition.
  • Use “optional” app config files for site-specific variations in the core suite’s Rose apps.
  • (Make minimal use of inlined site switches too, if necessary).
  • When referencing files, reference them within the suite structure and use an install task to link external files in.

The result should actually be tidier than the original in one respect: all the messy platform-specific resource directives and so on will be hidden away in the site include-files.

16.5.1. The Jinja2 SITE Variable

First a suite Jinja2 variable called SITE should be set to the site name, either in rose-suite.conf, or in the suite definition itself (perhaps automatically, by querying the local environment in some way).

#!Jinja2
{% set SITE = "niwa" %}
#...

This will be used to select site-specific configuration, as described below.

16.5.2. Site Include-Files

If a section heading in a suite.rc file is repeated the items under it simply add to or override those defined under the same section earlier in the file (but note List Item Override In Site Include-Files). For example, this task definition:

[runtime]
    [[foo]]
        script = run-foo.sh
        [[[remote]]]
            host = hpc1.niwa.co.nz

can equally be written like this:

[runtime]  # Part 1 (site-agnostic).
    [[foo]]
        script = run-foo.sh
[runtime]  # Part 2 (site-specific).
    [[foo]]
        [[[remote]]]
            host = hpc1.niwa.co.nz

Note

If Part 2 had also defined script the new value would override the original. It can sometimes be useful to set a widely used default and override it in a few cases, but be aware that this can make it more difficult to determine the origin of affected values.

In this way all site-specific [runtime] settings, with their respective sub-section headings, can be moved to the end of the file, and then out into an include-file (file inclusion is essentially just literal inlining):

#...
{% set SITE = "niwa" %}

# Core site-agnostic settings:
#...
[runtime]
    [[foo]]
        script = run-foo.sh
#...

# Site-specific settings:
{% include 'site/' ~ SITE ~ '.rc' %}

where the site include-file site/niwa.rc contains:

# site/niwa.rc
[runtime]
    [[foo]]
        [[[remote]]]
            host = hpc1.niwa.co.nz

16.5.3. Site-Specific Graphs

Repeated graph strings under the same graph section headings are always additive (graph strings are the only exception to the normal repeat item override semantics). So, for instance, this graph:

[scheduling]
    initial cycle point = 2025
    [[dependencies]]
        [[[P1Y]]]
            graph = "pre => model => post => niwa_archive"

can be written like this:

[scheduling]
    initial cycle point = 2025
    [[dependencies]]
        [[[P1Y]]]
            graph = "pre => model => post"
        [[[P1Y]]]
            graph = "post => niwa_archive"

and again, the site-specific part can be taken out to a site include-file:

#...
{% set SITE = "niwa" %}

# Core site-agnostic settings.
#...
[scheduling]
    initial cycle point = 2025
    [[dependencies]]
        [[[P1Y]]]
            graph = "pre => model => post"
#...
# Site-specific settings:
{% include 'site/' ~ SITE ~ '.rc' %}

where the site include-file site/niwa.rc contains:

# site/niwa.rc
[scheduling]
    [[dependencies]]
        [[[P1Y]]]
            graph = "post => niwa_archive"

Note that the site-file graph needs to define the dependencies of the site-specific tasks, and thus their points of connection to the core suite - which is why the core task post appears in the graph here (if post had any site-specific runtime settings, to get it to run at this site, they would also be in the site-file).

16.5.4. Inlined Site-Switching

It may be tempting to use inlined switch blocks throughout the suite instead of site include-files, but this is not recommended - it is verbose and untidy (the greater the number of supported sites, the bigger the mess) and it exposes all site configuration to all users:

#...
[runtime]
    [[model]]
        script = run-model.sh
{# Site switch blocks not recommended:#}
{% if SITE == 'niwa' %}
        [[[job]]]
            batch system = loadleveler
        [[[directives]]]
            # NIWA Loadleveler directives...
{% elif SITE == 'metoffice' %}
        [[[job]]]
            batch system = pbs
        [[[directives]]]
            # Met Office PBS directives...
{% elif SITE == ... %}
            #...
{% else %}
    {{raise('Unsupported site: ' ~ SITE)}}
{% endif %}
    #...

Inlined switches can be used, however, to configure exceptional behaviour at one site without requiring the other sites to duplicate the default behaviour. But be wary of accumulating too many of these switches:

# (core suite.rc file)
#...
{% if SITE == 'small' %}
   {# We can't run 100 members... #}
   {% set ENSEMBLE_SIZE = 25 %}
{% else %}
   {# ...but everyone else can! #}
   {% set ENSEMBLE_SIZE = 100 %}
{% endif %}
#...

Inlined switches can also be used to temporarily isolate a site-specific change to a hitherto non site-specific part of the suite, thereby avoiding the need to update all site include-files before getting agreement from the suite owner and collaborators.

16.5.5. Site-Specific Suite Variables

It can sometimes be useful to set site-specific values of suite variables that aren’t exposed to users via rose-suite.conf. For example, consider a suite that can run a special post-processing workflow of some kind at sites where IDL is available. The IDL-dependence switch can be set per site like this:

#...
{% from SITE ~ '-vars.rc' import HAVE_IDL, OTHER_VAR %}
graph = """
  pre => model => post
{% if HAVE_IDL %}
      post => idl-1 => idl-2 => idl-3
{% endif %}
        """

where for SITE = niwa the file niwa-vars.rc contains:

{# niwa-vars.rc #}
{% set HAVE_IDL = True %}
{% set OTHER_VAR = "the quick brown fox" %}

Note we are assuming there are significantly fewer options (IDL or not, in this case) than sites, otherwise the IDL workflow should just go in the site include-files of the sites that need it.

16.5.6. Site-Specific Optional Suite Configs

During development and testing of a portable suite you can use an optional Rose suite config file to automatically set site-specific suite inputs and thereby avoid the need to make manual changes every time you check out and run a new version. The site switch itself has to be set of course, but there may be other settings too such as model parameters for a standard local test domain. Just put these settings in opt/rose-suite-niwa.conf (for site “niwa”) and run the suite with rose suite-run -O niwa.
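
For example, a sketch of such a file (the Jinja2 inputs shown are invented):

# opt/rose-suite-niwa.conf
[jinja2:suite.rc]
SITE="niwa"
MODEL_DOMAIN="test-nz"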

16.5.7. Site-Agnostic File Paths in App Configs

Where possible apps should be configured to reference files within the suite structure itself rather than outside of it. This makes the apps themselves portable and it becomes the job of the install task to ensure all required source files are available within the suite structure e.g. via symlink into the share directory. Additionally, by moving the responsibility of linking files into the suite to an install task you gain the added benefit of knowing if a file is missing at the start of a suite rather than part way into a run.

16.5.8. Site-Specific Optional App Configs

Typically a few, but not all, apps will need some site customization, e.g. for local archive configuration or local science options. To avoid explicit site customization of individual task-run command lines, use Rose’s built-in optional app config capability:

[runtime]
    [[root]]
        script = rose task-run -v -O '({{SITE}})'

Normally a missing optional app config is considered an error, but the round parentheses here mean the named optional config is itself optional - i.e. use it if it exists, otherwise ignore it.

With this setting in place we can simply add an opt/rose-app-niwa.conf to any app that needs customization at SITE = niwa.
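For example, a hypothetical archiving app could keep its NIWA-specific target directory in:

# app/archive/opt/rose-app-niwa.conf
[env]
ARCHIVE_TARGET=/niwa/archive/products

Sites without such a file simply run on the main app config unchanged.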

16.5.9. An Example

The following small suite is not portable because all of its tasks are submitted to a NIWA HPC host; two tasks are entirely NIWA-specific, in that they respectively install files from a local database and upload products to a local distribution system; and one task runs a somewhat NIWA-specific configuration of a model. The remaining tasks are site-agnostic apart from local job host and batch scheduler directives.

[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 2017-01-01
    [[dependencies]]
        [[[R1]]]
            graph = install_niwa => preproc
        [[[P1D]]]
            graph = """
                preproc & model[-P1D] => model => postproc => upload_niwa
                postproc => idl-1 => idl-2 => idl-3"""
[runtime]
    [[root]]
        script = rose task-run -v
    [[HPC]]  # NIWA job host and batch scheduler settings.
        [[[remote]]]
            host = hpc1.niwa.co.nz
        [[[job]]]
            batch system = loadleveler
        [[[directives]]]
            account_no = NWP1623
            class = General
            job_type = serial  # (most jobs in this suite are serial)
    [[install_niwa]]  # NIWA-specific file installation task.
        inherit = HPC
    [[preproc]]
        inherit = HPC
    [[model]]  # Run the model on a local test domain.
        inherit = HPC
        [[[directives]]]  # Override the serial job_type setting.
            job_type = parallel
        [[[environment]]]
            SPEED = fast
    [[postproc]]
        inherit = HPC
    [[upload_niwa]]  # NIWA-specific product upload.
        inherit = HPC

To make this portable, refactor it into a core suite.rc file that contains the clean site-independent workflow configuration and loads all site-specific settings from an include-file at the end:

# suite.rc: CORE SITE-INDEPENDENT CONFIGURATION.
{% set SITE = 'niwa' %}
{% from 'site/' ~ SITE ~ '-vars.rc' import HAVE_IDL %}
[cylc]
    UTC mode = True
[scheduling]
    initial cycle point = 2017-01-01
    [[dependencies]]
        [[[P1D]]]
            graph = """
preproc & model[-P1D] => model => postproc
{% if HAVE_IDL %}
    postproc => idl-1 => idl-2 => idl-3
{% endif %}
                    """
[runtime]
    [[root]]
        script = rose task-run -v -O '({{SITE}})'
    [[preproc]]
        inherit = HPC
    [[postproc]]
        inherit = HPC
    [[model]]
        inherit = HPC
        [[[environment]]]
            SPEED = fast
{% include 'site/' ~ SITE ~ '.rc' %}

plus site files site/niwa-vars.rc:

# site/niwa-vars.rc: NIWA SITE SETTINGS FOR THE EXAMPLE SUITE.
{% set HAVE_IDL = True %}

and site/niwa.rc:

# site/niwa.rc: NIWA SITE SETTINGS FOR THE EXAMPLE SUITE.
[scheduling]
    [[dependencies]]
        [[[R1]]]
            graph = install_niwa => preproc
        [[[P1D]]]
            graph = postproc => upload_niwa
[runtime]
    [[HPC]]
        [[[remote]]]
            host = hpc1.niwa.co.nz
        [[[job]]]
            batch system = loadleveler
        [[[directives]]]
            account_no = NWP1623
            class = General
            job_type = serial  # (most jobs in this suite are serial)
    [[install_niwa]]  # NIWA-specific file installation.
    [[model]]
        [[[directives]]]  # Override the serial job_type setting.
            job_type = parallel
    [[upload_niwa]]  # NIWA-specific product upload.

and finally, an optional app config file for the local model domain:

app/model/rose-app.conf  # Main app config.
app/model/opt/rose-app-niwa.conf  # NIWA site settings.
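The optional config itself need contain nothing more than the local test domain overrides, e.g. (namelist and values hypothetical):

# app/model/opt/rose-app-niwa.conf
[namelist:domain]
nlat=80
nlon=100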

Some points to note:

  • It is straightforward to extend support to a new site by copying the existing site files and adapting them to the new job host, batch scheduler, and so on.
  • Batch system directives should be considered site-specific unless all supported sites have the same batch system and the same host architecture (including CPU clock speed and memory size etc.).
  • We’ve assumed that all tasks run on a single HPC host at both sites. If that’s not a valid assumption, the HPC family inheritance relationships would have to become site-specific (see the sketch after this list).
  • Core task runtime configurations aren’t needed in the site files at all if their job host and batch system settings can be defined in common families (HPC in this case).
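For instance, a site with separate serial and parallel job hosts might split the family like this (host names hypothetical):

# site/bigsite.rc (fragment)
[runtime]
    [[HPC]]  # Settings common to both hosts.
        [[[job]]]
            batch system = pbs
    [[HPC_PARALLEL]]
        inherit = HPC
        [[[remote]]]
            host = hpc-par.bigsite.example
    [[model]]
        # Overrides the core file's "inherit = HPC", because the site
        # file is included last and repeated items override.
        inherit = HPC_PARALLEL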

16.5.10. Collaborative Development Model

Official releases of a portable suite should be made from the suite trunk.

Changes should be developed on feature branches so as not to affect other users of the suite.

Site-specific changes shouldn’t touch the core suite.rc file, just the relevant site include-file, and therefore should not need close scrutiny from other sites.

Changes to the core suite.rc file should be agreed by all stakeholders, and should be carefully checked for effects on site include-files:

  • Changing the name of tasks or families in the core suite may break sites that add configuration to the original runtime namespace.
  • Adding new tasks or families to the core suite may require corresponding additions to the site files.
  • Deleting tasks or families from the core suite may require corresponding parts of the site files to be removed. And also, check for site-specific triggering off of deleted tasks or families.

However, if the owner site has to get some changes into the trunk before all collaborating sites have time to test them, version control will of course protect those lagging behind from any immediate ill effects.

When a new feature is complete and tested at the developer’s site, the suite owner should check out the branch, review and test it, and if necessary request that other sites do the same and report back. The owner can then merge the new feature to the trunk once satisfied.

All planning and discussion associated with the change should be documented on MOSRS Trac tickets associated with the suite.

16.5.11. Research-To-Operations Transition

Under this collaborative development model it is possible to use the same suite in research and operations, largely eliminating the difficult translation between the two environments. Where appropriate, this can save a lot of work.

Operations-specific parts of the suite should be factored out (as for site portability) into include-files that are only loaded in the operational environment. Improvements and upgrades can be developed on feature branches in the research environment. Operations staff can check out completed feature branches for testing in the operational environment before merging to trunk or referring back to research if problems are found. After sufficient testing the new suite version can be deployed into operations.
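A minimal sketch, assuming a hypothetical RUN_ENV suite input that distinguishes the two environments:

# suite.rc (at the end, after the site include):
{% if RUN_ENV == 'oper' %}
{% include 'site/oper.rc' %}
{% endif %}

Here site/oper.rc would hold the operations-only tasks and settings, exactly as the site include-files hold the site-specific ones.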

Note

This obviously glosses over the myriad complexities of the technical and scientific testing and validation of suite upgrades; it merely describes what is possible from a suite design and collaborative development perspective.

16.6. Roadmap

Several planned future developments in Rose and Cylc may have an impact on suite design.

16.6.1. List Item Override In Site Include-Files

A few Cylc config items hold lists of task (or family) names, e.g.:

[scheduling]
    [[special tasks]]
        clock-trigger = get-data-a, get-data-b
    #...
#...

Currently a repeated config item completely overrides a previously set value (apart from graph strings, which are always additive). This means a site include-file, for example, can’t add a new site-specific clock-triggered task without writing out the complete list of all clock-triggered tasks in the suite, which breaks the otherwise clean separation into core and site files.
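For example, to add one site-specific clock-triggered task a site file currently has to repeat the complete core list:

# site/niwa.rc (fragment)
[scheduling]
    [[special tasks]]
        clock-trigger = get-data-a, get-data-b, get-data-niwa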

Note

In the future we plan to support add, subtract, unset, and override semantics for all items.

16.6.2. UM STASH in Optional App Configs

A caveat to the advice on use of optional app configs in Optional App Config Files: in general you might need the ability to turn off or modify some STASH requests in the main app, not just add site-specific STASH. But overriding STASH in optional configs is fragile, because STASH namelist names are automatically generated from a hash of the precise content of the namelist. This makes it possible to uniquely identify the same STASH requests in different apps, but if any detail of a STASH request changes in a main app, its namelist name will change and any optional configs that refer to it will become divorced from their intended target.

Until this problem is solved we recommend that:

  • All STASH in main UM apps should be grouped into sensible packages that can be turned on and off in optional configs without referencing the individual STASH request namelists.
  • Or all STASH should be held in optional site configs and none in the main app. Note however that STASH is difficult to configure outside of rose edit, and the editor does not yet allow you to edit optional configs.

16.6.3. Modular Suite Design

The modular suite design concept is that we should be able to import common workflow segments at install time rather than duplicating them in each suite. The content of a suite module will be encapsulated in a protected namespace to avoid clashing with the importing suite, and selected inputs and outputs exposed via a proper interface.

This should aid portable suite design too by enabling site-specific parts of a workflow (local product generation for example) to be stored and imported on-site rather than polluting the source and revision control record of the core suite that everyone sees.

We note that this can already be done to a limited extent by using rose suite-run to install suite.rc fragments from an external location. However, this is a literal inlining mechanism with no encapsulation or interface: the internals of the “imported” fragments have to be compatible with the suite definition in every respect.
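For example, a fragment could be installed at the suite level like this (the source location is hypothetical) and then inlined with a Jinja2 include:

# rose-suite.conf
[file:site/products.rc]
source=fcm:fragments_tr/products.rc

This makes the fragment available at install time, but provides none of the namespace protection that true modular imports would.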

See also Monolithic Or Interdependent Suites on modular systems of suites connected by inter-suite triggering.