The reason for spawn-on-submit and run-time dependency matching is historical: Cylc originally had no workflow definition and no graph, just a loose collection of separate task definitions, each with its own prerequisites and outputs, and it was up to Cylc to make the connections.
Each task also knew its own cycling sequence (e.g. the forecast model should run on a 6-hour cycle), so it seemed natural that a task should spawn its own next-cycle successor.
When I bolted the “new” suite.rc and graph on the front (Cylc 3?), I used the graph only to generate the prerequisites and outputs for each task automatically, and just passed these to the original “self-organising scheduler”. But this dynamic dependency matching is technically unnecessary even in the current system, because the graph already records which task outputs satisfy which prerequisites.
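To make the contrast concrete, here is a minimal sketch of the two approaches. It is an illustration only, not Cylc's actual code: `TaskProxy`, `match_at_run_time`, `satisfy_from_graph`, the message strings, and the shape of the `graph` dictionary are all hypothetical names and structures chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TaskProxy:
    """Hypothetical stand-in for a task proxy; not Cylc's real class."""
    name: str
    point: int
    prerequisites: dict = field(default_factory=dict)  # message -> satisfied?
    completed_outputs: set = field(default_factory=set)

    def ready(self) -> bool:
        # A task can run once every prerequisite is satisfied.
        return all(self.prerequisites.values())


def match_at_run_time(pool):
    """Self-organising style: compare all completed outputs
    against all prerequisites in the pool."""
    completed = set().union(*(t.completed_outputs for t in pool))
    for task in pool:
        for message in task.prerequisites:
            if message in completed:
                task.prerequisites[message] = True


def satisfy_from_graph(pool, graph, message):
    """Graph style: the graph (assumed here to map an output message to
    the IDs of the tasks that depend on it) is consulted directly."""
    by_id = {(t.name, t.point): t for t in pool}
    for task_id in graph.get(message, []):
        by_id[task_id].prerequisites[message] = True


# Usage: bar.1 depends on foo.1 succeeding.
foo = TaskProxy("foo", 1, completed_outputs={"foo.1 succeeded"})
bar = TaskProxy("bar", 1, prerequisites={"foo.1 succeeded": False})
match_at_run_time([foo, bar])
```

In the first function the scheduler has to search the whole pool to discover the connection; in the second, the connection is simply looked up, which is the sense in which the run-time matching is redundant once a graph exists.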
At start-up we instantiate the first cycle-point instance of every task in the workflow, then each spawns its own next-cycle successor at job submit time (which prevents uncontrolled spawning but still allows successive instances to run in parallel). This works remarkably well (at ensuring task proxies exist before they are needed) BUT: