Custom Job Submission Methods

class cylc.flow.job_runner_handlers.documentation.ExampleHandler

Documentation for writing job runner handlers.

Cylc can submit jobs to a number of different job runners (also known as batch systems), e.g. Slurm and PBS. For a list of built-in integrations see Supported Job Submission Methods.

If the job runner you require is not on this list, Cylc provides a generic interface for writing your own integration.

Defining a new job runner handler requires a little Python programming. Use the built-in handlers (e.g. cylc.flow.job_runner_handlers.background) as examples.

Installation

Custom job runner handlers must be installed on workflow and job hosts in one of these locations:

  • under WORKFLOW-RUN-DIR/lib/python/

  • under CYLC-PATH/cylc/flow/job_runner_handlers/

  • or anywhere in $PYTHONPATH

Each module should export the symbol JOB_RUNNER_HANDLER, the singleton instance that implements the job runner handler logic, e.g.:

my_handler.py
class MyHandler():
    pass

JOB_RUNNER_HANDLER = MyHandler()

Each job runner handler class must be instantiable with no arguments.

Usage

You can then define a Cylc platform using the handler:

global.cylc
[platforms]
    [[my_platform]]
        job runner = my_handler  # note matches Python module name
        hosts = localhost

And configure tasks to submit to it:

flow.cylc
[runtime]
    [[my_task]]
        script = echo "Hello World!"
        platform = my_platform

Common Arguments

job_conf: dict

The Cylc job configuration as a dictionary with the following fields:

  • dependencies

  • directives

  • env-script

  • environment

  • err-script

  • execution_time_limit

  • exit-script

  • flow_nums

  • init-script

  • job_d

  • job_file_path

  • job_runner_command_template

  • job_runner_name

  • logfiles

  • namespace_hierarchy

  • param_var

  • platform

  • post-script

  • pre-script

  • script

  • submit_num

  • task_id

  • try_num

  • uuid_str

  • work_d

  • workflow_name

submit_opts: dict

The Cylc job submission options as a dictionary which may contain the following fields:

  • env

  • execution_time_limit

  • job_runner_cmd_tmpl

An Example

The following qsub.py module overrides the built-in pbs job runner handler to change the directive prefix from #PBS to #QSUB:

#!/usr/bin/env python3

from cylc.flow.job_runner_handlers.pbs import PBSHandler

class QSUBHandler(PBSHandler):
    DIRECTIVE_PREFIX = "#QSUB "

JOB_RUNNER_HANDLER = QSUBHandler()

If this is in the Python search path (see Installation above) you can use it by name in your global configuration:

[platforms]
    [[my_platform]]
        hosts = myhostA, myhostB
        job runner = qsub  # <---!

Then in your flow.cylc file you can use this platform:

# Note, this workflow will fail at run time because we only changed the
# directive format, and PBS does not accept ``#QSUB`` directives in
# reality.

[scheduling]
    [[graph]]
        R1 = "a"
[runtime]
    [[root]]
        execution time limit = PT1M
        platform = my_platform
        [[[directives]]]
            -l nodes = 1
            -q = long
            -V =

Note

Don’t subclass ExampleHandler, as it provides optional interfaces which you may not want to inherit.

FAIL_SIGNALS: Tuple[str]

A tuple containing the names of signals to trap for reporting errors.

The default is ("EXIT", "ERR", "TERM", "XCPU").

ERR and EXIT are always recommended. EXIT is used to report premature stopping of the job script, and its trap is unset at the end of the script.

KILL_CMD_TMPL: str

Command template for killing a job submission.

A Python string template for getting the job runner command to remove and terminate a job ID. The command is formed using the logic: job_runner.KILL_CMD_TMPL % {"job_id": job_id}.

For info on Python string template format see: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
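
As a quick illustration of the template logic above, the sketch below fills a kill template with a job ID. The command name myq-cancel is a hypothetical placeholder, not a real job runner command.

```python
# Hypothetical template: "myq-cancel" stands in for a real kill command.
KILL_CMD_TMPL = "myq-cancel '%(job_id)s'"

# Mapping-style % formatting substitutes the named key.
kill_cmd = KILL_CMD_TMPL % {"job_id": "12345"}
print(kill_cmd)
```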

POLL_CANT_CONNECT_ERR: str

String for detecting communication errors in poll command output.

A string containing an error message. If this is defined, when a poll command returns a non-zero return code and its STDERR contains this string, then the poll result will not be trusted, because it is assumed that the job runner is currently unavailable. Jobs submitted to the job runner will be assumed OK until we are able to connect to the job runner again.
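
The rule described above can be sketched as a small predicate. This is only an illustration of the documented behaviour, not Cylc's actual implementation; the error message and the function name poll_result_is_trusted are assumptions made for the example.

```python
# Hypothetical error text that the job runner prints when it is unreachable.
POLL_CANT_CONNECT_ERR = "Connection refused"


def poll_result_is_trusted(ret_code: int, stderr: str) -> bool:
    """Return False if a failed poll looks like a communication error."""
    if ret_code != 0 and POLL_CANT_CONNECT_ERR in stderr:
        # The job runner is assumed to be unavailable: ignore this poll.
        return False
    return True
```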

POLL_CMD: str

Command for checking job submissions.

A list of job IDs to poll will be provided as arguments.

The command should write valid submitted/running job IDs to stdout.

REC_ID_FROM_SUBMIT_ERR: Pattern

Regular expression to extract job IDs from submission stderr.

See ExampleHandler.REC_ID_FROM_SUBMIT_OUT.

REC_ID_FROM_SUBMIT_OUT: Pattern

Regular expression to extract job IDs from submission stdout.

A regular expression (compiled) to extract the job “id” from the standard output or standard error of the job submission command.
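
A minimal sketch of such a pattern follows. It assumes the job ID is captured in a named group called id (suggested by the quoted “id” above; check a built-in handler to confirm) and that the submit command prints a line such as "Submitted job 12345", which is a made-up output format.

```python
import re

# Hypothetical submit output format: "Submitted job 12345".
# The named group "id" is an assumption, see the note above.
REC_ID_FROM_SUBMIT_OUT = re.compile(r"\ASubmitted job (?P<id>\d+)")

match = REC_ID_FROM_SUBMIT_OUT.match("Submitted job 12345\n")
job_id = match.group("id") if match else None
print(job_id)
```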

SHOULD_KILL_PROC_GROUP: bool

Kill jobs by killing the process group.

A boolean to indicate whether it is necessary to kill a job by sending a signal to its Unix process group. This boolean also indicates that a job submitted via this job runner will physically run on the same host it is submitted to.

SHOULD_POLL_PROC_GROUP: bool

Poll jobs by PID.

A boolean to indicate whether it is necessary to poll a job by its PID as well as the job ID.

SUBMIT_CMD_ENV: Iterable[str]

Extra environment variables for the job runner command.

A Python dict (or an iterable that can be used to update a dict) containing extra environment variables for getting the job runner command to submit a job file.
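
To illustrate what "can be used to update a dict" means, the sketch below merges two equivalent forms of extra variables into a base environment. The variable MYQ_LANG and the base environment are hypothetical, and how Cylc itself applies SUBMIT_CMD_ENV is not shown here.

```python
# Two equivalent ways to express the extra variables (hypothetical values).
SUBMIT_CMD_ENV = {"MYQ_LANG": "C"}
SUBMIT_CMD_ENV_PAIRS = [("MYQ_LANG", "C")]

base_env = {"PATH": "/usr/bin"}

# dict.update accepts either a mapping or an iterable of (key, value) pairs.
env_from_dict = dict(base_env)
env_from_dict.update(SUBMIT_CMD_ENV)

env_from_pairs = dict(base_env)
env_from_pairs.update(SUBMIT_CMD_ENV_PAIRS)
```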

SUBMIT_CMD_TMPL: str

Command template for job submission.

A Python string template for getting the job runner command to submit a job file. The command is formed using the logic: job_runner.SUBMIT_CMD_TMPL % {"job": job_file_path}

For info on Python string template format see: https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
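
Putting the attributes documented so far together, a handler for a simple command-line job runner might look like the sketch below. The myq-* commands and their output format are hypothetical, and the named group id in the regular expression is an assumption.

```python
import re


class MyQHandler:
    """Sketch of a handler for a hypothetical "myq" job runner."""

    SUBMIT_CMD_TMPL = "myq-submit '%(job)s'"
    KILL_CMD_TMPL = "myq-cancel '%(job_id)s'"
    POLL_CMD = "myq-status"
    REC_ID_FROM_SUBMIT_OUT = re.compile(r"\ASubmitted job (?P<id>\d+)")


# The class takes no constructor arguments, as required.
JOB_RUNNER_HANDLER = MyQHandler()

# The submit command is formed from the template and the job file path.
submit_cmd = JOB_RUNNER_HANDLER.SUBMIT_CMD_TMPL % {"job": "/path/to/job"}
```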

filter_poll_many_output(out: str) List[str]

Filter job IDs out of poll output.

Called after the job runner’s poll command. The method should read the output and return a list of job IDs that are still in the job runner.

Args:

out: Job poll stdout.

Returns:

List of job IDs
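
A minimal sketch of this method, assuming a made-up poll output format: one job per line with the job ID in the first column, preceded by a header line starting with JOBID.

```python
from typing import List


class SketchHandler:
    """Illustrative handler; the poll output format is hypothetical."""

    def filter_poll_many_output(self, out: str) -> List[str]:
        job_ids = []
        for line in out.splitlines():
            fields = line.split()
            # Skip blank lines and the assumed "JOBID ..." header line.
            if not fields or fields[0] == "JOBID":
                continue
            job_ids.append(fields[0])
        return job_ids
```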

filter_submit_output(out: str, err: str) Tuple[str, str]

Filter job submission stdout/err.

Filter the standard output and standard error of the job submission command. This is useful if the job submission command returns information that should just be ignored.

See also ExampleHandler.SUBMIT_CMD_TMPL.

Args:

out: Job submit stdout.

err: Job submit stderr.

Returns:

(new_out, new_err)
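
As an illustration, the sketch below drops informational lines from stderr and passes stdout through unchanged. The NOTICE: prefix is a hypothetical marker for output that should be ignored.

```python
from typing import Tuple


class SketchHandler:
    """Illustrative handler; the "NOTICE:" prefix is hypothetical."""

    IGNORED_PREFIX = "NOTICE:"

    def filter_submit_output(self, out: str, err: str) -> Tuple[str, str]:
        # Keep every stderr line except those starting with the prefix.
        new_err = "".join(
            line
            for line in err.splitlines(keepends=True)
            if not line.startswith(self.IGNORED_PREFIX)
        )
        return out, new_err
```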

format_directives(job_conf: dict) List[str]

Returns lines to be appended to the job script.

This method formats the job directives for a job file, if job file directives are relevant for the job runner. The argument “job_conf” is a dict containing the job configuration.

Args:

job_conf: The Cylc job configuration.

Returns:

lines
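
A minimal sketch, assuming that job_conf["directives"] maps directive names to values and that the job runner expects one "PREFIX key=value" line per directive. The #MYQ prefix and the key=value layout are hypothetical; real job runners differ.

```python
from typing import List


class SketchHandler:
    """Illustrative handler; prefix and line layout are hypothetical."""

    DIRECTIVE_PREFIX = "#MYQ "

    def format_directives(self, job_conf: dict) -> List[str]:
        lines = []
        for key, value in job_conf.get("directives", {}).items():
            if value:
                lines.append(f"{self.DIRECTIVE_PREFIX}{key}={value}")
            else:
                # A directive with no value, e.g. "-V =", is written bare.
                lines.append(f"{self.DIRECTIVE_PREFIX}{key}")
        return lines
```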

get_poll_many_cmd(job_id_list: List[str]) List[str]

Return a command to poll the specified jobs.

If specified, this will be called instead of ExampleHandler.POLL_CMD.

Args:

job_id_list: The list of job IDs to poll.

Returns:

command, e.g. ['foo', '--bar', 'baz']
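
For example, a job runner whose status command takes a comma-separated list of IDs could be handled as sketched below. The command myq-status and its --jobs option are hypothetical.

```python
from typing import List


class SketchHandler:
    """Illustrative handler; "myq-status --jobs" is a made-up command."""

    def get_poll_many_cmd(self, job_id_list: List[str]) -> List[str]:
        # Return the command as an argument list, not a shell string.
        return ["myq-status", "--jobs", ",".join(job_id_list)]
```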

get_submit_stdin(job_file_path: str, submit_opts: dict) Tuple

Return a 2-element tuple (proc_stdin_arg, proc_stdin_value).

  • Element 1 is suitable for the stdin=... argument of subprocess.Popen so it can be a file handle, subprocess.PIPE or None.

  • Element 2 is the string content to pipe to stdin of the submit command (relevant only if proc_stdin_arg is subprocess.PIPE).

Args:

job_file_path: The path to the job file for this submission.

submit_opts: Job submission options.

Returns:

(proc_stdin_arg, proc_stdin_value)
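
The sketch below shows one way to satisfy this interface, assuming a submit command that reads the job script from stdin: it returns subprocess.PIPE together with the job file content. The throwaway job file at the end exists only to demonstrate the call.

```python
import os
import subprocess
import tempfile
from typing import Tuple


class SketchHandler:
    """Illustrative handler that pipes the job file to the submit command."""

    def get_submit_stdin(
        self, job_file_path: str, submit_opts: dict
    ) -> Tuple[int, str]:
        with open(job_file_path) as job_file:
            content = job_file.read()
        # PIPE tells Popen to create a pipe; content is what gets written.
        return (subprocess.PIPE, content)


# Usage: write a throwaway job file and inspect the result.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as tmp:
    tmp.write("echo hi\n")
stdin_arg, stdin_value = SketchHandler().get_submit_stdin(tmp.name, {})
os.remove(tmp.name)
```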

get_vacation_signal(job_conf: dict) str

Return the vacation signal.

If relevant, return a string containing the name of the signal that indicates the job has been vacated by the job runner.

Args:

job_conf: The Cylc job configuration.

Returns:

signal
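
A minimal sketch, assuming a job runner that only vacates jobs which opt in through a directive. The directive name restart, its value yes and the choice of USR1 are assumptions made for illustration.

```python
from typing import Optional


class SketchHandler:
    """Illustrative handler; directive name and signal are assumptions."""

    VACATION_SIGNAL = "USR1"

    def get_vacation_signal(self, job_conf: dict) -> Optional[str]:
        directives = job_conf.get("directives", {})
        # Only jobs that opted in to restarts can be vacated in this sketch.
        if directives.get("restart") == "yes":
            return self.VACATION_SIGNAL
        return None
```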

manip_job_id(job_id: str) str

Modify the job ID that is returned by the job submit command.

Args:

job_id: The job ID returned by the submit command.

Returns:

job_id
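
For instance, if the submit command were to print IDs with a server suffix (a hypothetical format such as 12345.server01) while the poll and kill commands expect the bare number, the method could strip the suffix:

```python
class SketchHandler:
    """Illustrative handler; the "ID.server" format is hypothetical."""

    def manip_job_id(self, job_id: str) -> str:
        # Keep only the part before the first dot.
        return job_id.split(".", 1)[0]
```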

submit(job_file_path: str, submit_opts: dict) Tuple[int, str, str]

Submit a job.

Submit a job and return the return code, standard output and standard error of the submission. This method is useful if the job submission requires logic beyond just running a system or shell command.

See also ExampleHandler.SUBMIT_CMD_TMPL.

You must pass env=submit_opts.get('env') to Popen - see cylc.flow.job_runner_handlers.background for an example.

Args:

job_file_path: The job file for this submission.

submit_opts: Job submission options.

Returns:

(ret_code, out, err)
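
The sketch below illustrates the shape of a custom submit method. It is a simplification, not how the built-in handlers work: it runs the job file with /bin/sh and waits for it to finish, which would only suit a submission step that returns quickly. The throwaway job file at the end exists only to demonstrate the call.

```python
import os
import subprocess
import tempfile
from typing import Optional, Tuple


class SketchHandler:
    """Illustrative handler that runs the job file directly and waits."""

    def submit(
        self, job_file_path: str, submit_opts: dict
    ) -> Tuple[int, Optional[str], Optional[str]]:
        try:
            proc = subprocess.Popen(
                ["/bin/sh", job_file_path],
                env=submit_opts.get("env"),  # required, see above
                stdin=subprocess.DEVNULL,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                text=True,
            )
        except OSError as exc:
            # Report a failure to launch as a non-zero return code.
            return (1, None, str(exc))
        out, err = proc.communicate()
        return (proc.returncode, out, err)


# Usage: a throwaway job file that prints a made-up job ID.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as tmp:
    tmp.write("echo 12345\n")
result = SketchHandler().submit(tmp.name, {})
os.remove(tmp.name)
```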