Datetime Cycling
Aims
In the last section we created an integer cycling workflow with numbered cycle points.
Workflows may need to be repeated at regular time intervals, say every day or every few hours. To support this, Cylc can generate datetime sequences as cycle points instead of integers.
Reminder
In Cylc, cycle points are task labels that anchor the dependencies between individual tasks: this task depends on that task in that cycle. Tasks can run as soon as their individual dependencies are met, so cycles do not necessarily run in order, or at the real world time corresponding to the cycle point value (to do that, see Clock Triggers).
ISO8601
Cylc uses the ISO8601 datetime standard to represent datetimes and durations.
ISO8601 Datetimes
ISO8601 datetimes are written from the largest unit to the smallest (year, month, day, hour, minute, second) with a T separating the date and time components. For example, midnight on the 1st of January 2000 is written 20000101T000000.
For brevity we can omit seconds (or minutes) from the time: 20000101T0000 (or 20000101T00).
Note
The smallest interval for a datetime cycling sequence in Cylc is 1 minute.
For readability we can add hyphens (-) between the date components and colons (:) between the time components. This is optional, but if you do it you must use both hyphens and colons.
Time-zone information can be added onto the end. UTC is written Z, UTC+1 is written +01, etc. For example: 2000-01-01T00:00Z.
ISO8601 Durations
ISO8601 durations are prefixed with a P (for “period”), and each unit is followed by a special character:
Y for year.
M for month.
D for day.
W for week.
H for hour.
M for minute.
S for second.
The letter M means month in the date part and minute in the time part (after the T).
As for datetimes, duration components are written in order from largest to smallest, and the date and time components are separated by a T:
P1D: one day.
PT1H: one hour.
P1DT1H: one day and one hour.
PT1H30M: one and a half hours.
P1Y1M1DT1H1M1S: a year and a month and a day and an hour and a minute and a second.
Datetime Recurrences
In integer cycling workflows, recurrences are written P1, P2, etc.
In datetime cycling workflows, there are two ways to write recurrences:
Using ISO8601 durations (e.g. P1D, PT1H).
Using ISO8601 datetimes with inferred recurrence.
Inferred Recurrence
Recurrence can be inferred from a datetime by omitting components from the front. For example, if the year is omitted then the recurrence can be inferred to be annual. E.g.:
Recurrence | Description |
---|---|
2000-01-01T00 | Midnight on the 1st of January 2000 |
-01-01T00 | Every year on the 1st of January |
-01T00 | Every month on the first of the month |
T00 | Every day at midnight |
T-00 | Every hour at zero minutes past (i.e. every hour on the hour) |
Note
To omit hours from a datetime, place a - after the T character.
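As a minimal sketch of how inferred recurrences might appear in a graph section (the task names daily and hourly are hypothetical, and the initial cycle point is chosen only for illustration):

```cylc
[scheduling]
    initial cycle point = 2000-01-01T00Z
    [[graph]]
        # T00: every day at midnight (date components omitted).
        T00 = daily
        # T-00: every hour on the hour (hours omitted with "-").
        T-00 = hourly
```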
Recurrence Formats
As with integer cycling, recurrences start at the initial cycle point by default. We can override this in two ways:
By giving an arbitrary start cycle point (datetime/recurrence):
2000/P4Y: every fourth year, starting with the year 2000.
2000-01-01T00/P1D: every day at midnight, starting on the 1st of January 2000.
By offset, relative to the initial cycle point (offset/recurrence). The offset must be an ISO8601 duration preceded by a plus character:
+PT1H/PT1H: every hour starting one hour after the initial cycle point.
+P1Y/P1Y: every year starting one year after the initial cycle point.
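The two override formats could sit alongside a default recurrence roughly as follows. This is an illustrative sketch only: the task names and the start date of 3 January 2000 are assumptions, not part of the tutorial workflow.

```cylc
[scheduling]
    initial cycle point = 2000-01-01T00Z
    [[graph]]
        # Default: every day, starting at the initial cycle point.
        P1D = every_day
        # datetime/recurrence: every day, starting on 3 January 2000.
        2000-01-03T00Z/P1D = from_third
        # offset/recurrence: every hour, starting one hour after
        # the initial cycle point (i.e. from 2000-01-01T01Z).
        +PT1H/PT1H = from_one_hour_in
```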
Durations and the Initial Cycle Point
When using durations, beware that a change in the initial cycle point might produce different results for the recurrences.
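For example, with the initial cycle point at midnight, the P1D recurrence gives a cycle at midnight every day:

    [scheduling]
        initial cycle point = 2000-01-01T00
        [[graph]]
            P1D = foo[-P1D] => foo

With the initial cycle point at midday, the same recurrence gives a cycle at midday every day:

    [scheduling]
        initial cycle point = 2000-01-01T12
        [[graph]]
            P1D = foo[-P1D] => foo
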
We could write the recurrence “every midnight” independent of the initial cycle point by:
Using an inferred recurrence instead (i.e. T00).
Overriding the recurrence start point (i.e. T00/P1D).
Using [scheduling]initial cycle point constraints to constrain the initial cycle point (e.g. to a particular time of day). See the Cylc User Guide for details.
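For instance, the first option might look like the sketch below, which reuses the foo task and the midday initial cycle point from the example above. Because T00 pins the time of day, foo should cycle at midnight regardless of the start time, so the first cycle here would be 2000-01-02T00.

```cylc
[scheduling]
    # Midday start, as in the second example above.
    initial cycle point = 2000-01-01T12
    [[graph]]
        # Inferred recurrence: every day at midnight, whatever
        # the initial cycle point is.
        T00 = foo[-P1D] => foo
```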
The Initial and Final Cycle Points
There are two special recurrences for the initial and final cycle points:
R1: run once at the initial cycle point.
R1/P0Y: run once at the final cycle point.
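A sketch of how these two recurrences might frame a daily cycle. The task names install, foo and tidy, and the final cycle point, are assumptions made for illustration.

```cylc
[scheduling]
    initial cycle point = 2000-01-01T00Z
    final cycle point = 2000-01-05T00Z
    [[graph]]
        # Runs once, at the initial cycle point, before the first foo.
        R1 = install => foo
        # Runs every day.
        P1D = foo[-P1D] => foo
        # Runs once, at the final cycle point, after the last foo.
        R1/P0Y = foo => tidy
```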
Intercycle Dependencies
Intercycle dependencies are written as ISO8601 durations, e.g.:
foo[-P1D]: the task foo from the cycle one day before.
bar[-PT1H30M]: the task bar from the cycle 1 hour 30 minutes before.
The initial cycle point can be referenced using a caret character (^), e.g.:
baz[^]: the task baz from the initial cycle point.
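Put together in a graph, these references might look like the following sketch. The six-hourly recurrence and the way the tasks are wired together are assumptions for illustration.

```cylc
[scheduling]
    initial cycle point = 2000-01-01T00Z
    [[graph]]
        # baz runs once, at the initial cycle point.
        R1 = baz
        PT6H = """
            # Each foo waits for the foo from six hours earlier.
            foo[-PT6H] => foo
            # Every foo also waits for the one-off baz.
            baz[^] => foo
        """
```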
Cycle Point Time Zone
Cylc can generate datetime cycle points in any time zone, but “daylight saving” boundaries can cause confusion, so the default is UTC, i.e. the +00 time zone. You can override this by setting [scheduler]cycle point time zone.
Note
UTC is sometimes also labelled Z (“Zulu” from the NATO phonetic alphabet) according to the military time zone convention.
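For example, to generate cycle points in UTC+1 rather than UTC, the setting might be written as below. This is a sketch; the value +01 is chosen only for illustration.

```cylc
[scheduler]
    cycle point time zone = +01
```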
Putting It All Together
Cylc was originally developed for running operational weather forecasting. In this section we will outline how to implement a basic weather-forecasting workflow.
Note
Technically this example is a nowcasting workflow, but the distinction doesn’t matter here.
A basic weather forecasting workflow has three main steps:
1. Gathering Observations
We gather observations from different weather stations to build a picture of the current weather. Our dummy weather forecast will get wind observations from four weather stations:
Aldergrove (Near Belfast in NW of the UK)
Camborne (In Cornwall, the far SW of England)
Heathrow (Near London in the SE)
Shetland (The northernmost part of the UK)
The tasks that retrieve observation data will be called get_observations_<site> where site is the name of the weather station.
Next we need to consolidate the observations so that our forecasting system can work with them. To do this we have a consolidate_observations task.
We will fetch wind observations every three hours, starting from the initial cycle point.
The consolidate_observations task must run after the get_observations_<site> tasks.
We will also use the UK radar network to get rainfall data with a task called get_rainfall.
We will fetch rainfall data every six hours, from six hours after the initial cycle point.
2. Running Computer Models to Generate Forecast Data
We will do this with a task called forecast that runs every six hours, from six hours after the initial cycle point.
The forecast task will depend on:
The consolidate_observations task from the previous two cycles and the present cycle.
The get_rainfall task from the present cycle.
3. Processing the data output to produce user-friendly forecasts
This will be done with a task called post_process_<location> where location is the place we want to generate the forecast for. For the moment we will use Exeter.
The post_process_exeter task will run every six hours starting six hours after the initial cycle point and will be dependent on the forecast task.
Practical
In this practical we will create a dummy forecasting workflow using datetime cycling.
Create A New Workflow.
Create a new source directory datetime-cycling under ~/cylc-src, and move into it:

    mkdir ~/cylc-src/datetime-cycling
    cd ~/cylc-src/datetime-cycling

Create a flow.cylc file and paste the following code into it:

    [scheduler]
        UTC mode = True
        allow implicit tasks = True
    [scheduling]
        initial cycle point = 20000101T00Z
        [[graph]]

Add The Recurrences.
The weather-forecasting workflow will require two recurrences. Add these under the graph section based on the information given above.
Hint
See Datetime Recurrences.
Solution
The two recurrences you need are:
PT3H: repeat every three hours starting from the initial cycle point.
+PT6H/PT6H: repeat every six hours starting six hours after the initial cycle point.

      [scheduler]
          UTC mode = True
          allow implicit tasks = True
      [scheduling]
          initial cycle point = 20000101T00Z
          [[graph]]
    +         PT3H =
    +         +PT6H/PT6H =

Write The Graph.
With the help of the information above, add the tasks and dependencies to implement the weather-forecasting workflow.
You will need to consider the intercycle dependencies between tasks as well.
Use cylc graph to inspect your work.
Hint
The dependencies you will need to formulate are as follows:
The consolidate_observations task depends on get_observations_<site>.
The forecast task depends on:
    the get_rainfall task;
    the consolidate_observations tasks from:
        the same cycle;
        the cycle 3 hours before (-PT3H);
        the cycle 6 hours before (-PT6H).
The post_process_exeter task depends on the forecast task.
To visualise your workflow run the command:
cylc graph <path/to/flow.cylc>
Solution

    [scheduler]
        UTC mode = True
        allow implicit tasks = True
    [scheduling]
        initial cycle point = 20000101T00Z
        [[graph]]
            PT3H = """
                get_observations_aldergrove => consolidate_observations
                get_observations_camborne => consolidate_observations
                get_observations_heathrow => consolidate_observations
                get_observations_shetland => consolidate_observations
            """
            +PT6H/PT6H = """
                consolidate_observations => forecast
                consolidate_observations[-PT3H] => forecast
                consolidate_observations[-PT6H] => forecast
                get_rainfall => forecast => post_process_exeter
            """

Intercycle Offsets.
To ensure the forecast tasks run in the right order (one cycle after another) they each need to depend on their own previous run. We can express this dependency as forecast[-PT6H] => forecast.
However, the forecast task runs every six hours starting 6 hours after the initial cycle point, so the dependency is only valid from 12:00 onwards. To fix the problem we must add a new dependency section which repeats every six hours starting 12 hours after the initial cycle point:

      +PT6H/PT6H = """
          ...
    -     forecast[-PT6H] => forecast
      """
    + +PT12H/PT6H = """
    +     forecast[-PT6H] => forecast
    + """