Custom Job Submission Methods
- class cylc.flow.job_runner_handlers.documentation.ExampleHandler[source]
Documentation for writing job runner handlers.
Cylc can submit jobs to a number of different job runners (aka batch systems) e.g. Slurm and PBS. For a list of built-in integrations see Supported Job Submission Methods.
If the job runner you require is not on this list, Cylc provides a generic interface for writing your own integration.
Defining a new job runner handler requires a little Python programming. Use the built-in handlers (e.g. cylc.flow.job_runner_handlers.background) as examples.
Installation
Custom job runner handlers must be installed on workflow and job hosts in one of these locations:
- under WORKFLOW-RUN-DIR/lib/python/
- under CYLC-PATH/cylc/flow/job_runner_handlers/
- or anywhere in $PYTHONPATH
Each module should export the symbol JOB_RUNNER_HANDLER for the singleton instance that implements the job system handler logic, e.g.:
    class MyHandler():
        pass

    JOB_RUNNER_HANDLER = MyHandler()
Each job runner handler class must be instantiable with no arguments.
Usage
You can then define a Cylc platform using the handler:
    [platforms]
        [[my_platform]]
            job runner = my_handler  # note matches Python module name
            hosts = localhost
And configure tasks to submit to it:
    [runtime]
        [[my_task]]
            script = echo "Hello World!"
            platform = my_platform
Common Arguments
job_conf: dict
The Cylc job configuration as a dictionary with the following fields:
dependencies
directives
env-script
environment
err-script
execution_time_limit
exit-script
flow_nums
init-script
job_d
job_file_path
job_runner_command_template
job_runner_name
namespace_hierarchy
param_var
platform
post-script
pre-script
script
submit_num
task_id
try_num
uuid_str
work_d
workflow_name
submit_opts: dict
The Cylc job submission options as a dictionary which may contain the following fields:
env
execution_time_limit
job_runner_cmd_tmpl
An Example
The following qsub.py module overrides the built-in pbs job runner handler to change the directive prefix from #PBS to #QSUB:
    #!/usr/bin/env python3

    from cylc.flow.job_runner_handlers.pbs import PBSHandler


    class QSUBHandler(PBSHandler):
        DIRECTIVE_PREFIX = "#QSUB "


    JOB_RUNNER_HANDLER = QSUBHandler()
If this is in the Python search path (see Installation above) you can use it by name in your global configuration:
    [platforms]
        [[my_platform]]
            hosts = myhostA, myhostB
            job runner = qsub  # <---!
Then in your flow.cylc file you can use this platform:
    # Note, this workflow will fail at run time because we only changed the
    # directive format, and PBS does not accept ``#QSUB`` directives in
    # reality.
    [scheduling]
        [[graph]]
            R1 = "a"
    [runtime]
        [[root]]
            execution time limit = PT1M
            platform = my_platform
            [[[directives]]]
                -l nodes = 1
                -q = long
                -V =
Note
Don’t subclass this class as it provides optional interfaces which you may not want to inherit.
- FAIL_SIGNALS: Tuple[str]
A tuple containing the names of signals to trap for reporting errors.
The default is ("EXIT", "ERR", "TERM", "XCPU").
ERR and EXIT are always recommended. EXIT is used to report premature stopping of the job script, and its trap is unset at the end of the script.
- KILL_CMD_TMPL: str
Command template for killing a job submission.
A Python string template for getting the job runner command to remove and terminate a job ID. The command is formed using the logic:
job_runner.KILL_CMD_TMPL % {"job_id": job_id}
For info on Python string template format see: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
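The substitution is ordinary printf-style formatting against a mapping. A minimal sketch, assuming a hypothetical `mykill` command (not a real job runner CLI) and a made-up job ID:

```python
import shlex


class MyHandler:
    # Hypothetical kill command; "%(job_id)s" is filled in from a mapping.
    KILL_CMD_TMPL = "mykill %(job_id)s"


job_runner = MyHandler()

# Same expression as in the description above, with an illustrative job ID.
command = job_runner.KILL_CMD_TMPL % {"job_id": "12345"}

# Split into an argument list, as one might before running the command.
argv = shlex.split(command)
print(argv)
```

The template must use the named placeholder `%(job_id)s`, because the value is supplied as a dict rather than a tuple.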
- POLL_CANT_CONNECT_ERR: str
String for detecting communication errors in poll command output.
A string containing an error message. If this is defined, when a poll command returns a non-zero return code and its STDERR contains this string, then the poll result will not be trusted, because it is assumed that the job runner is currently unavailable. Jobs submitted to the job runner will be assumed OK until we are able to connect to the job runner again.
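The rule can be sketched as a small predicate. This only illustrates the described behaviour and is not Cylc's own implementation; the function name is hypothetical and the error string is just an example value.

```python
def poll_result_is_trustworthy(ret_code, stderr, cant_connect_err=None):
    """Return False when a failed poll looks like a connection problem.

    cant_connect_err plays the role of POLL_CANT_CONNECT_ERR; None means
    the handler does not define it, so every poll result is trusted.
    """
    if cant_connect_err and ret_code != 0 and cant_connect_err in stderr:
        return False
    return True


# Example value only; a real handler would use its job runner's message.
POLL_CANT_CONNECT_ERR = "Connection refused"

print(poll_result_is_trustworthy(1, "mystat: Connection refused", POLL_CANT_CONNECT_ERR))
print(poll_result_is_trustworthy(1, "mystat: unknown job 42", POLL_CANT_CONNECT_ERR))
```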
- POLL_CMD: str
Command for checking job submissions.
A list of job IDs to poll will be provided as arguments.
The command should write valid submitted/running job IDs to stdout.
To filter out invalid/failed jobs use ExampleHandler.filter_poll_many_output().
To build a more advanced command than is possible with this configuration use ExampleHandler.get_poll_many_cmd().
- REC_ID_FROM_SUBMIT_ERR: re.Pattern
Regular expression to extract job IDs from submission stderr.
- REC_ID_FROM_SUBMIT_OUT: re.Pattern
Regular expression to extract job IDs from submission stdout.
A regular expression (compiled) to extract the job “id” from the standard output or standard error of the job submission command.
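A minimal sketch of such a pattern, assuming the ID is captured in a named group called `id` (as the quoted "id" above suggests) and assuming a submit command that prints a line like `Submitted batch job 12345`. The `extract_job_id` helper is hypothetical and only shows how the pattern might be applied.

```python
import re


class MyHandler:
    # Assumed output format: "Submitted batch job <digits>".
    REC_ID_FROM_SUBMIT_OUT = re.compile(r"\ASubmitted\sbatch\sjob\s(?P<id>\d+)")


def extract_job_id(pattern, text):
    """Return the first job ID matched on any line of text, else None."""
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group("id")
    return None


print(extract_job_id(MyHandler.REC_ID_FROM_SUBMIT_OUT, "Submitted batch job 12345\n"))
```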
- SHOULD_KILL_PROC_GROUP: bool
Kill jobs by killing the process group.
A boolean to indicate whether it is necessary to kill a job by sending a signal to its Unix process group. This boolean also indicates that a job submitted via this job runner will physically run on the same host it is submitted to.
- SHOULD_POLL_PROC_GROUP: bool
Poll jobs by PID.
A boolean to indicate whether it is necessary to poll a job by its PID as well as the job ID.
- SUBMIT_CMD_ENV: Iterable[str]
Extra environment variables for the job runner command.
A Python dict (or an iterable that can be used to update a dict) containing extra environment variables for getting the job runner command to submit a job file.
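One way to picture how these variables combine with an existing environment is a plain `dict.update`. This is a sketch under assumptions: the variable name is invented and `build_submit_env` is a hypothetical helper, not part of Cylc.

```python
import os


class MyHandler:
    # Hypothetical variable read by an imaginary job runner client.
    SUBMIT_CMD_ENV = {"MY_RUNNER_CONF": "/etc/my_runner.conf"}


def build_submit_env(handler, base_env=None):
    """Copy the base environment and overlay the handler's extras."""
    env = dict(os.environ if base_env is None else base_env)
    # dict.update accepts a dict or an iterable of (key, value) pairs.
    env.update(handler.SUBMIT_CMD_ENV)
    return env


env = build_submit_env(MyHandler(), {"PATH": "/usr/bin"})
print(env)
```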
- SUBMIT_CMD_TMPL: str
Command template for job submission.
A Python string template for getting the job runner command to submit a job file. The command is formed using the logic:
job_runner.SUBMIT_CMD_TMPL % {"job": job_file_path}
For info on Python string template format see: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
- filter_poll_many_output(out)[source]
Filter job IDs out of poll output.
Called after the job runner’s poll command. The method should read the output and return a list of job IDs that are still in the job runner.
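A minimal sketch of such a filter, assuming a hypothetical `mystat` command that prints a header line followed by `JOB_ID STATE` rows. The output format and state names are invented for illustration.

```python
class MyHandler:
    POLL_CMD = "mystat"  # hypothetical status command

    def filter_poll_many_output(self, out):
        """Return the IDs of jobs the job runner still holds."""
        finished_states = {"DONE", "FAILED"}
        job_ids = []
        for line in out.splitlines():
            fields = line.split()
            # Skip the header and blank lines: real rows start with digits.
            if len(fields) >= 2 and fields[0].isdigit():
                if fields[1] not in finished_states:
                    job_ids.append(fields[0])
        return job_ids


sample = "JOB_ID STATE\n101 RUNNING\n102 DONE\n103 QUEUED\n"
print(MyHandler().filter_poll_many_output(sample))
```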
- filter_submit_output(out, err)[source]
Filter job submission stdout/err.
Filter the standard output and standard error of the job submission command. This is useful if the job submission command returns information that should just be ignored.
See also ExampleHandler.SUBMIT_CMD_TMPL.
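A sketch of a filter that drops informational noise from standard error. It assumes the method returns the filtered `(out, err)` pair, which the text above does not state explicitly, and the `INFO:` prefix is an invented example.

```python
class MyHandler:
    def filter_submit_output(self, out, err):
        """Drop lines starting with "INFO:" from stderr; keep stdout as is."""
        kept = [
            line
            for line in err.splitlines(keepends=True)
            if not line.startswith("INFO:")
        ]
        return out, "".join(kept)


out, err = MyHandler().filter_submit_output(
    "12345\n", "INFO: using default queue\nWARNING: low quota\n"
)
print(repr(out), repr(err))
```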
- format_directives(job_conf)[source]
Returns lines to be appended to the job script.
This method formats the job directives for a job file, if job file directives are relevant for the job runner. The argument “job_conf” is a dict containing the job configuration.
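A minimal sketch of a directive formatter, assuming the method returns a list of lines and that `job_conf["directives"]` maps directive names to values. The `#MYRUN ` prefix and the `key=value` layout are invented; a real job runner dictates its own syntax.

```python
class MyHandler:
    DIRECTIVE_PREFIX = "#MYRUN "  # hypothetical prefix

    def format_directives(self, job_conf):
        """Return one directive line per entry in job_conf["directives"]."""
        lines = []
        for key, value in job_conf.get("directives", {}).items():
            if value:
                lines.append(f"{self.DIRECTIVE_PREFIX}{key}={value}")
            else:
                # Flags with no value are written bare.
                lines.append(f"{self.DIRECTIVE_PREFIX}{key}")
        return lines


job_conf = {"directives": {"--queue": "long", "--exclusive": ""}}
print(MyHandler().format_directives(job_conf))
```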
- get_poll_many_cmd(job_id_list)[source]
Return a command to poll the specified jobs.
If specified, this will be called instead of ExampleHandler.POLL_CMD.
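A sketch of a poll command builder, assuming the method returns an argument list and that a hypothetical `mystat` command takes one `--job` option per ID. Both the command and its flags are invented.

```python
class MyHandler:
    def get_poll_many_cmd(self, job_id_list):
        """Build one status command covering every requested job."""
        cmd = ["mystat", "--noheader"]
        for job_id in job_id_list:
            cmd += ["--job", job_id]
        return cmd


print(MyHandler().get_poll_many_cmd(["101", "103"]))
```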
- get_submit_stdin(job_file_path, submit_opts)[source]
Return a 2-element tuple (proc_stdin_arg, proc_stdin_value).
Element 1 is suitable for the stdin=... argument of subprocess.Popen so it can be a file handle, subprocess.PIPE or None.
Element 2 is the string content to pipe to stdin of the submit command (relevant only if proc_stdin_arg is subprocess.PIPE).
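A sketch for a job runner that reads the command to run from standard input. The `mysubmit` command and the piped text are assumptions made for illustration.

```python
import shlex
from subprocess import PIPE


class MyHandler:
    SUBMIT_CMD_TMPL = "mysubmit"  # hypothetical command that reads stdin

    def get_submit_stdin(self, job_file_path, submit_opts):
        """Ask for a pipe and provide the text to write into it."""
        # Quote the path so spaces or shell metacharacters survive intact.
        return (PIPE, f"sh {shlex.quote(job_file_path)}\n")


stdin_arg, stdin_value = MyHandler().get_submit_stdin("/tmp/job", {})
print(stdin_arg is PIPE, repr(stdin_value))
```

Returning an open file handle as the first element instead would let the submit command read the job file directly, in which case the second element is not used.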
- get_vacation_signal(job_conf)[source]
Return the vacation signal.
If relevant, return a string containing the name of the signal that indicates the job has been vacated by the job runner.
- submit(job_file_path, submit_opts)[source]
Submit a job.
Submit a job and return an instance of the Popen object for the submission. This method is useful if the job submission requires logic beyond just running a system or shell command.
See also ExampleHandler.SUBMIT_CMD_TMPL.
You must pass env=submit_opts.get('env') to Popen; see cylc.flow.job_runner_handlers.background for an example.
- Parameters:
- Returns:
(ret_code, out, err)
- ret_code:
Subprocess return code.
- out:
Subprocess standard output, note this should be newline terminated.
- err:
Subprocess standard error.
- Return type:
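A minimal sketch of a custom submit method that follows the (ret_code, out, err) return shape listed above. It runs the job file directly with `sh` and waits for it, purely to show the call shape and the `env` handling; a real handler would hand the job to its job runner instead, and the temporary job file here is only a stand-in.

```python
import os
import tempfile
from subprocess import DEVNULL, PIPE, Popen


class MyHandler:
    def submit(self, job_file_path, submit_opts):
        """Run the job file and return (ret_code, out, err)."""
        try:
            proc = Popen(
                ["sh", job_file_path],
                env=submit_opts.get("env"),  # required, see the note above
                stdin=DEVNULL,
                stdout=PIPE,
                stderr=PIPE,
                text=True,
            )
        except OSError as exc:
            return (1, "", str(exc))
        out, err = proc.communicate()
        # The standard output should be newline terminated.
        if out and not out.endswith("\n"):
            out += "\n"
        return (proc.returncode, out, err)


# Stand-in job file for demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
    handle.write("echo hello\n")
result = MyHandler().submit(handle.name, {})
os.unlink(handle.name)
print(result)
```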