Advanced
Scheduler Signals
The Cylc scheduler will shut down gracefully on receipt of any of the following signals:
SIGINT
SIGTERM
SIGHUP
The signal will cause the scheduler to shut down in --now mode.
If the scheduler is already shutting down in --now mode, the signal will escalate shutdown to --now --now mode.
See cylc stop --help for details on stop modes.
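The escalation rule above can be sketched as a small state machine. This is a minimal toy model, not Cylc's implementation: the class ToyScheduler, its stop_mode attribute, and the methods on_signal and install are hypothetical names introduced only for illustration. The sketch assumes a POSIX platform, where SIGHUP is defined.

```python
import signal

# Signals the section lists as triggering a graceful shutdown.
SHUTDOWN_SIGNALS = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP)


class ToyScheduler:
    """Hypothetical stand-in for a scheduler; not Cylc's own class."""

    def __init__(self):
        self.stop_mode = None  # None means "running normally"

    def on_signal(self, signum, frame=None):
        """Enter --now mode on the first signal, escalate on the next."""
        if signum not in SHUTDOWN_SIGNALS:
            return self.stop_mode
        if self.stop_mode is None:
            self.stop_mode = "--now"
        else:
            # Already stopping: a further signal escalates the shutdown.
            self.stop_mode = "--now --now"
        return self.stop_mode

    def install(self):
        """Register the handler; must be called from the main thread."""
        for signum in SHUTDOWN_SIGNALS:
            signal.signal(signum, self.on_signal)
```

In this model, calling on_signal(signal.SIGTERM) on a fresh ToyScheduler sets the mode to --now, and any later listed signal moves it to --now --now, where it stays.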
Handling Job Preemption
Some HPC facilities allow job preemption: the resource manager can kill or suspend running low-priority jobs in order to make way for high-priority jobs. The resource manager may then automatically resume the preempted jobs from the same point (if suspended) or requeue them to run again from the start (if killed).
Suspended jobs will poll as still running: their job status file says they started running, and they still appear in the resource manager queue.
Loadleveler jobs that are preempted by kill-and-requeue (“job vacation”) are automatically returned to the submitted state by Cylc. This is possible because Loadleveler sends the SIGUSR1 signal before SIGKILL for preemption. Other job runners just send SIGTERM before SIGKILL as normal, so Cylc cannot distinguish a preemption job kill from a normal job kill. A job preempted this way will therefore poll as failed (correctly, because it was killed, and the job status file records that).
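The distinction between the two kill cases can be sketched as a lookup on the signal a job receives before SIGKILL. This is a toy model under stated assumptions, not Cylc's code: the table PREEMPTION_SIGNALS and the function status_after_signal are hypothetical, and only the Loadleveler/SIGUSR1 pairing is taken from the text above.

```python
# Assumption: map each job runner to the signals it sends on preemption.
# Only the Loadleveler/SIGUSR1 pairing comes from the text above.
PREEMPTION_SIGNALS = {"loadleveler": {"SIGUSR1"}}


def status_after_signal(job_runner, signal_name):
    """Return the toy job status after the job receives a signal."""
    if signal_name in PREEMPTION_SIGNALS.get(job_runner, set()):
        # Kill-and-requeue preemption was announced: the job will run again.
        return "submitted"
    # SIGTERM and other signals look like an ordinary kill.
    return "failed"
```

Here status_after_signal("loadleveler", "SIGUSR1") gives "submitted", while a SIGTERM from any job runner gives "failed", because nothing marks that signal as a preemption.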