Parameterized Tasks
Parameterized tasks (see parameterization) provide a way of implicitly looping over tasks without the need for Jinja2.
Cylc Parameters
Parameters are defined in their own section, e.g.:
[task parameters]
world = Mercury, Venus, Earth
They can then be referenced by writing the name of the parameter in angle brackets, e.g.:
[scheduling]
[[graph]]
R1 = start => hello<world> => end
[runtime]
[[hello<world>]]
script = echo 'Hello World!'
When the flow.cylc file is read by Cylc, the parameters will be expanded.
For example the code above is equivalent to:
[scheduling]
[[graph]]
R1 = """
start => hello_Mercury => end
start => hello_Venus => end
start => hello_Earth => end
"""
[runtime]
[[hello_Mercury]]
script = echo 'Hello World!'
[[hello_Venus]]
script = echo 'Hello World!'
[[hello_Earth]]
script = echo 'Hello World!'
We can refer to a specific parameter value by writing it after an = sign:
[runtime]
[[hello<world=Earth>]]
script = echo 'Greetings Earth!'
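The same notation can also be used in the graph to single out one parameter value. The following is a minimal sketch rather than part of the tutorial workflow: the phone_home task is a hypothetical name introduced purely for illustration, and the placeholder scripts are assumptions.

```cylc
[task parameters]
    world = Mercury, Venus, Earth
[scheduling]
    [[graph]]
        R1 = """
            start => hello<world> => end
            # phone_home is a hypothetical task; only hello_Earth triggers it.
            hello<world=Earth> => phone_home
        """
[runtime]
    [[start, end, phone_home]]
        script = true
    # General definition, applied to every value of world.
    [[hello<world>]]
        script = echo 'Hello World!'
    # More specific definition, overriding the script for Earth only.
    [[hello<world=Earth>]]
        script = echo 'Greetings Earth!'
```

Because the general section comes first and the specific one second, hello_Mercury and hello_Venus keep the general script while hello_Earth picks up the override.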
Environment Variables
The value of the parameter is provided to the job as an environment variable called CYLC_TASK_PARAM_<parameter>, where <parameter> is the name of the parameter (in the present case world):
[runtime]
[[hello<world>]]
script = echo "Hello ${CYLC_TASK_PARAM_world}!"
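Since the variable is an ordinary environment variable, a job script can also branch on it. The sketch below assumes the same world parameter as above; the messages are illustrative only.

```cylc
[task parameters]
    world = Mercury, Venus, Earth
[scheduling]
    [[graph]]
        R1 = hello<world>
[runtime]
    [[hello<world>]]
        script = """
            # CYLC_TASK_PARAM_world holds this task's own parameter value.
            case "${CYLC_TASK_PARAM_world}" in
                Earth) echo "Greetings Earth!" ;;
                *)     echo "Hello ${CYLC_TASK_PARAM_world}!" ;;
            esac
        """
```

This keeps a single runtime section for all three tasks, at the cost of moving the per-value logic into the shell script rather than the workflow configuration.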
Parameter Types
Parameters can be either strings or integers:
[task parameters]
foo = 1..5
bar = 1..5..2
baz = pub, qux, bol
Hint
Remember that by default Cylc automatically inserts an underscore between the task and the parameter, e.g. the following lines are equivalent:
task<baz=pub>
task_pub
Hint
When using integer parameters, to prevent confusion, Cylc prefixes the parameter value with the parameter name. For example:
[scheduling]
[[graph]]
R1 = """
# task<bar> would result in:
task_bar1
task_bar3
task_bar5
# task<baz> would result in:
task_pub
task_qux
task_bol
"""
Using parameters, the get_observations configuration could be written like so:
[scheduling]
[[graph]]
T00/PT3H = """
get_observations<station> => consolidate_observations
"""
[runtime]
[[get_observations<station>]]
script = get-observations
[[[environment]]]
API_KEY = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
[[get_observations<station=aldergrove>]]
[[[environment]]]
SITE_ID = 3917
[[get_observations<station=camborne>]]
[[[environment]]]
SITE_ID = 3808
[[get_observations<station=heathrow>]]
[[[environment]]]
SITE_ID = 3772
[[get_observations<station=shetland>]]
[[[environment]]]
SITE_ID = 3005
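The configuration above relies on a station parameter being defined. Assuming the same four stations that are used in the practical below, the definition would look something like this:

```cylc
[task parameters]
    # One get_observations task is created per station listed here.
    station = aldergrove, camborne, heathrow, shetland
```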
For more information see the Cylc User Guide.
Practical
This practical continues on from the Jinja2 practical.
Use Parameterization To Consolidate The get_observations Tasks.
Next we will parameterize the get_observations tasks.
Add a parameter called station:
+[task parameters]
+station = aldergrove, camborne, heathrow, shetland
[scheduler]
UTC mode = True
Remove the four get_observations tasks and insert the following code in their place:
[[get_observations<station>]]
script = get-observations
[[[environment]]]
API_KEY = {{ API_KEY }}
Using cylc config you should see that Cylc replaces the <station> with each of the stations in turn, creating a new task for each:
cylc config . -i "[runtime]"
The get_observations tasks are now missing the SITE_ID environment variable. Add a new section for each station with a SITE_ID:
[[get_observations<station=heathrow>]]
[[[environment]]]
SITE_ID = 3772
Hint
The relevant IDs are:
Aldergrove - 3917
Camborne - 3808
Heathrow - 3772
Shetland - 3005
Solution
[[get_observations<station=aldergrove>]]
[[[environment]]]
SITE_ID = 3917
[[get_observations<station=camborne>]]
[[[environment]]]
SITE_ID = 3808
[[get_observations<station=heathrow>]]
[[[environment]]]
SITE_ID = 3772
[[get_observations<station=shetland>]]
[[[environment]]]
SITE_ID = 3005
Using cylc config you should now see four get_observations tasks, each with a script, an API_KEY and a SITE_ID:
cylc config . -i "[runtime]"
Finally we can use this parameterization to simplify the workflow’s graphing. Replace the get_observations lines in the graph with get_observations<station>:
# Repeat every three hours starting at the initial cycle point.
PT3H = """
-get_observations_aldergrove => consolidate_observations
-get_observations_camborne => consolidate_observations
-get_observations_heathrow => consolidate_observations
-get_observations_shetland => consolidate_observations
+get_observations<station> => consolidate_observations
"""
Hint
The cylc config command does not expand parameters or families in the graph so you must use cylc graph to inspect changes to the graphing.
Use Parameterization To Consolidate The post_process Tasks.
At the moment we only have one post_process task (post_process_exeter), but suppose we wanted to add a second task for Edinburgh.
Create a new parameter called site and set it to contain exeter and edinburgh. Parameterize the post_process task using this parameter.
Hint
The first argument to the post-process command is the name of the site. We can use the CYLC_TASK_PARAM_site environment variable to avoid having to write out this section twice.
Solution
First we must create the site parameter:
[scheduler]
UTC mode = True
[task parameters]
station = aldergrove, camborne, heathrow, shetland
+site = exeter, edinburgh
Next we parameterize the task in the graph:
-get_rainfall => forecast => post_process_exeter
+get_rainfall => forecast => post_process<site>
And also the runtime:
-[[post_process_exeter]]
+[[post_process<site>]]
# Generate a forecast for Exeter 60 minutes in the future.
-script = post-process exeter 60
+script = post-process $CYLC_TASK_PARAM_site 60
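As a rough check on the solution, the parameterized section should behave as though the two tasks had been written out by hand. The sketch below assumes the default naming (underscore plus parameter value). In the real workflow both tasks share the same script text and differ only in the value of CYLC_TASK_PARAM_site at run time, so the literal site names here show the effective commands rather than what Cylc stores.

```cylc
[runtime]
    # Effective behaviour of [[post_process<site>]] for site = exeter.
    [[post_process_exeter]]
        script = post-process exeter 60
    # Effective behaviour of [[post_process<site>]] for site = edinburgh.
    [[post_process_edinburgh]]
        script = post-process edinburgh 60
```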