Coordinating sub-workflows

The design of workflows allows multiple sub-workflows to be composed into a higher-level workflow. This is a powerful feature, but it involves several subtleties, chiefly in how sub-workflows exchange artifacts and how their work requests depend on one another.

General scheme

Cooperation is defined at the level of workflows, not of individual work requests. Individual work requests should not concern themselves with it; they should be designed to take inputs using lookups and produce output artifacts that are linked to the work request.

Sub-workflow coordination takes place through the workflow’s internal collection, which is shared among all sub-workflows of the same root workflow. This provides a mechanism for some work requests to declare that they will provide certain kinds of artifacts, which may then be required by work requests in other sub-workflows.

On the providing side, workflows use the update-collection-with-artifacts event reaction to add relevant output artifacts from work requests to the internal collection, and create promises to indicate to other workflows that they have done so. Providing workflows choose item names in the internal collection, and it is the responsibility of workflow designers to ensure that these names do not clash. Workflows that provide output artifacts have an optional prefix field in their task data, which allows multiple instances of the same workflow to cooperate under the same root workflow.
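The providing side can be sketched in Python as follows. This is a minimal illustration, assuming plain dictionaries stand in for the internal collection and for work requests. The function name provide_output, the dictionary layout, the field names inside the event reaction, and the example category value are assumptions made for illustration, not debusine's actual API; only the update-collection-with-artifacts action, the debusine:promise category, and the promise_* fields come from this design.

```python
PROMISE = "debusine:promise"


def provide_output(items, work_request, *, workflow_id, name, category, prefix=""):
    """Declare that work_request will provide an artifact under prefix + name.

    items is a dict standing in for the internal collection (hypothetical).
    """
    item_name = f"{prefix}{name}"
    if item_name in items:
        # Avoiding clashes is the workflow designer's job; fail loudly here.
        raise ValueError(f"item name already in use: {item_name}")

    # Event reaction: on success, add matching output artifacts to the
    # internal collection under the chosen name (field names are assumed).
    work_request.setdefault("on_success", []).append(
        {
            "action": "update-collection-with-artifacts",
            "collection": "internal@collections",
            "name_template": item_name,
            "artifact_filters": {"category": category},
        }
    )

    # Promise: tells other workflows which work request will provide the item.
    items[item_name] = {
        "category": PROMISE,
        "data": {
            "promise_work_request_id": work_request["id"],
            "promise_workflow_id": workflow_id,
            "promise_category": category,
        },
    }
    return item_name


items = {}
first = {"id": 101}
second = {"id": 102}
# Two instances of the same sub-workflow avoid a clash by using prefixes.
provide_output(items, first, workflow_id=10, name="build-amd64",
               category="debian:binary-package", prefix="stable/")
provide_output(items, second, workflow_id=11, name="build-amd64",
               category="debian:binary-package", prefix="testing/")
```

Without the prefix, the second call would raise because both instances would claim the name build-amd64.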

On the requiring side, workflows look up the names of artifacts they require in the internal collection. Each of those lookups may return nothing, a promise including a work request ID, or an artifact that already exists; workflows may use the result to determine which child work requests they create. They use lookups in their child work requests to refer to items in the internal collection (e.g. internal@collections/name:build-amd64), and add corresponding dependencies on work requests that promise to provide those items.
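The three possible lookup outcomes and the resulting dependency handling might look like the sketch below. It again assumes dictionaries in place of the real collection and work request models; resolve, require_input, the artifact_id key, and the binary_artifact task data field are hypothetical names chosen for this example.

```python
PROMISE = "debusine:promise"


def resolve(items, name):
    """Return ("missing", None), ("promise", work request ID) or
    ("artifact", artifact ID) for an internal collection item name."""
    item = items.get(name)
    if item is None:
        return ("missing", None)
    if item["category"] == PROMISE:
        return ("promise", item["data"]["promise_work_request_id"])
    return ("artifact", item["artifact_id"])


def require_input(items, work_request, field, name):
    """Point a task data field at an internal collection item.

    Returns False when nothing provides the item, so that the caller can
    decide not to create the child work request at all.
    """
    state, ref = resolve(items, name)
    if state == "missing":
        return False
    work_request["task_data"][field] = f"internal@collections/name:{name}"
    if state == "promise":
        # The artifact does not exist yet: wait for the promising work request.
        work_request["dependencies"].add(ref)
    return True


items = {
    "build-amd64": {
        "category": PROMISE,
        "data": {
            "promise_work_request_id": 101,
            "promise_workflow_id": 10,
            "promise_category": "debian:binary-package",
        },
    },
    "build-arm64": {
        "category": "debian:binary-package",
        "artifact_id": 7,
        "data": {},
    },
}
child = {"task_data": {}, "dependencies": set()}
require_input(items, child, "binary_artifact", "build-amd64")
```

The same lookup string works whether the item is still a promise or already an artifact; only the dependency differs.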

The Workflow (or WorkRequest, if appropriate) class gains helper methods for use by workflow orchestrators to declare that a work request provides output (adding an event reaction and a promise) and to declare that a work request requires input (adding any necessary dependencies based on its task data). These help to make both the providing and requiring sides work in uniform ways.

Orchestration changes

The populate method of sub-workflows is no longer called directly by the scheduler. Instead, the root workflow’s populate method calls populate on any sub-workflows it creates.

This means that sub-workflows may depend on other steps within the root workflow while still being fully populated in advance of being able to run. A workflow that needs more information before being able to populate child work requests should use workflow callbacks to run the workflow orchestrator again when it is ready. (For example, a workflow that creates a source package and then builds it may not know which work requests it needs to create until it has created the source package and can look at its Architecture field.)
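The following sketch illustrates the two ideas above: the root workflow populating its sub-workflows, and a sub-workflow that populates in two passes. The class names, the callback_pending flag, and the architectures parameter are hypothetical simplifications; in particular, the real orchestrator would read the Architecture field from the source package rather than receive a list.

```python
class SubWorkflow:
    """Hypothetical sub-workflow that builds a source package per architecture."""

    def __init__(self):
        self.children = []  # names of child work requests created so far
        self.callback_pending = False

    def populate(self, architectures=None):
        # Safe to call repeatedly: only missing children are added.
        if "sourcepackage" not in self.children:
            self.children.append("sourcepackage")
        if architectures is None:
            # Not enough information yet; ask to be run again later.
            self.callback_pending = True
            return
        self.callback_pending = False
        for arch in architectures:
            name = f"build-{arch}"
            if name not in self.children:
                self.children.append(name)


class RootWorkflow:
    def __init__(self, sub_workflows):
        self.sub_workflows = sub_workflows

    def populate(self):
        # The scheduler calls only this method; sub-workflows are
        # populated from here rather than by the scheduler.
        for sub in self.sub_workflows:
            sub.populate()


sub = SubWorkflow()
root = RootWorkflow([sub])
root.populate()
after_first_pass = list(sub.children)
pending_after_first_pass = sub.callback_pending

# Later, a workflow callback runs the orchestrator again once the source
# package exists and its architectures are known.
sub.populate(architectures=["amd64", "arm64"])
```

Making populate idempotent keeps the second pass from duplicating the work requests created in the first.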

Category debusine:promise

This bare data item represents an artifact that will eventually be provided as the output of some existing work request. The bare data item may contain additional properties of that artifact to help distinguish it from other promised artifacts. These items are created by workflows when they populate their work request graphs.

Workflows add promises to their internal collection using the same name as they use for the expected artifact.

  • Data:

    • promise_work_request_id (integer, required): the ID of the work request that is expected to fulfil this promise

    • promise_workflow_id (integer, required): the ID of the workflow work request that created this promise

    • promise_category (string, required): the category of the artifact that is expected to be provided

Other data items should match the per-item data structure used when adding the promised artifact to the workflow’s internal collection, which is determined by the parent workflow based on its protocol for communicating with other workflows. This allows constructing lookups that match either a promise or the promised artifact. Names starting with promise_ are reserved.
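As an illustration of the data layout described above, the sketch below builds promise data from the three required fields plus workflow-defined per-item data, and rejects extra names in the reserved promise_ namespace. The function make_promise_data and the architecture key are assumptions for this example only.

```python
RESERVED_PREFIX = "promise_"


def make_promise_data(*, work_request_id, workflow_id, category, extra=None):
    """Build the data of a debusine:promise item (hypothetical helper)."""
    extra = dict(extra or {})
    reserved = sorted(key for key in extra if key.startswith(RESERVED_PREFIX))
    if reserved:
        # Workflow-defined per-item data must stay out of the reserved namespace.
        raise ValueError(f"reserved names in per-item data: {reserved}")
    return {
        "promise_work_request_id": work_request_id,
        "promise_workflow_id": workflow_id,
        "promise_category": category,
        # Same keys as the per-item data of the eventual artifact, so that
        # one lookup can match either the promise or the artifact.
        **extra,
    }


promise = make_promise_data(
    work_request_id=101,
    workflow_id=10,
    category="debian:binary-package",
    extra={"architecture": "amd64"},
)
```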

When a work request is retried, it must update the work request ID in any associated promises.
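A sketch of that update, assuming a retry yields a work request with a new ID and that the internal collection can be scanned as a dictionary. The helper name retarget_promises and the item layout are hypothetical.

```python
PROMISE = "debusine:promise"


def retarget_promises(items, old_id, new_id):
    """Point every promise made by old_id at new_id; return the updated names."""
    updated = []
    for name, item in items.items():
        if (
            item["category"] == PROMISE
            and item["data"]["promise_work_request_id"] == old_id
        ):
            item["data"]["promise_work_request_id"] = new_id
            updated.append(name)
    return sorted(updated)


items = {
    "build-amd64": {
        "category": PROMISE,
        "data": {"promise_work_request_id": 101, "promise_workflow_id": 10,
                 "promise_category": "debian:binary-package"},
    },
    "build-arm64": {
        "category": PROMISE,
        "data": {"promise_work_request_id": 102, "promise_workflow_id": 10,
                 "promise_category": "debian:binary-package"},
    },
}
# Work request 101 failed and was retried as work request 205.
changed = retarget_promises(items, old_id=101, new_id=205)
```

Promises made by other work requests are left untouched, so dependants of build-arm64 still wait on work request 102.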

debusine:workflow-internal collection changes

The debusine:workflow-internal collection may now have per-item data, whose structure is defined by workflows using the update-collection-with-artifacts or update-collection-with-data event reactions. The variables (for update-collection-with-artifacts) or data fields (for update-collection-with-data) are copied into per-item data. Names starting with promise_ are reserved.

This allows matching promises or promised artifacts using workflow-defined criteria.
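For instance, a requiring workflow could select items by a workflow-defined architecture key regardless of whether each item is still a promise or already an artifact. The sketch below assumes a dictionary-backed collection; find_items and the architecture key are illustrative names, not part of debusine.

```python
def find_items(items, **criteria):
    """Return the names of items whose per-item data matches all criteria."""
    return sorted(
        name
        for name, item in items.items()
        if all(item["data"].get(key) == value for key, value in criteria.items())
    )


items = {
    # Still a promise: carries promise_* fields plus workflow-defined data.
    "build-amd64": {
        "category": "debusine:promise",
        "data": {
            "promise_work_request_id": 101,
            "promise_workflow_id": 10,
            "promise_category": "debian:binary-package",
            "architecture": "amd64",
        },
    },
    # Already an artifact: same workflow-defined key, no promise_* fields.
    "build-arm64": {
        "category": "debian:binary-package",
        "data": {"architecture": "arm64"},
    },
}
matches = find_items(items, architecture="amd64")
```

Because the promise and the artifact share the architecture key, the same criteria keep matching after the promise is fulfilled.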