Task configuration

When you maintain a complete distribution like Debian or one of its derivatives, you have to deal with special cases and exceptions, for example:

  • disable build/autopkgtest/etc. of a package on a specific architecture because it kills the workers

  • restrict the build/autopkgtest/etc. of a package to specific workers where the build is known to succeed

  • etc.

As a derivative, you might want to make opinionated choices and change some of the build parameters by using a specific build profile on some packages.

Those tweaks and exceptions are recorded in a debusine:task-configuration collection, and later used to feed the relevant workflows and work requests.

This collection is meant to store configuration data for tasks, as “key/value pairs” that are fed into the task_data field of work requests. Any type of task can be configured, but for all practical purposes, Worker and Workflow tasks are the most likely targets for configuration.

The final configured task data is generated by merging multiple snippets of configuration, each stored in its debusine:task-configuration entry, and each applying at different levels of granularity.

Looking up task configuration entries for a task

To provide fine-grained control of the configuration, we consider that a subject is being processed by a task and that the task can have a configuration context. The configuration context is typically another parameter of the task that can usefully be leveraged to apply some consistent configuration across all work requests sharing the same configuration context.

Todo

Once we are able to use tags for matching task configuration entries, we will effectively have the equivalent of supporting multiple configuration context entries, and we may replace the configuration context entirely.

Those two values are used to look up the various snippets of configuration. The snippets are retrieved and processed in the following order:

  • global (subject=None, context=None)

  • context level (subject=None, context != None)

  • subject level (subject != None, context=None)

  • specific-combination level (subject != None, context != None)

The collection can host partial or full configuration data, but it is expected to be mainly useful for storing overrides, i.e. variations compared to the defaults provided by the task or its containing workflow.

For example, for the debian-pipeline workflow, subject would typically be the source package name while context would be the name of the target suite.
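As an illustrative sketch (the helper and its names are hypothetical, not part of Debusine's API), the four lookup levels can be expressed as an ordered list of (subject, context) pairs, from most general to most specific:

```python
def lookup_order(subject, context):
    """Return the (subject, context) pairs to look up, most general first.

    Hypothetical helper for illustration; the ordering matches the four
    levels listed above.
    """
    return [
        (None, None),        # global
        (None, context),     # context level
        (subject, None),     # subject level
        (subject, context),  # specific-combination level
    ]

# For the debian-pipeline example: subject is the source package name,
# context is the target suite.
print(lookup_order("grub2", "trixie"))
# → [(None, None), (None, 'trixie'), ('grub2', None), ('grub2', 'trixie')]
```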

About templates

Template entries follow the same structure as other entries, but they are only used indirectly, when a normal configuration entry refers to them as part of its use_templates field.

Templates are meant to share common configuration across multiple similar packages.

Example:

template:uefi-sign:
  default_values:
    enable_make_signed_source: True
    make_signed_source_purpose: uefi

template:uefi-sign-with-fwupd-key:
  use_templates:
    - uefi-sign
  default_values:
    make_signed_source_key: AEC1234

template:uefi-sign-with-grub-key:
  use_templates:
    - uefi-sign
  default_values:
    make_signed_source_key: CBD3214

Workflow:debian-pipeline:fwupd-efi::
  use_templates:
    - uefi-sign-with-fwupd-key

Workflow:debian-pipeline:fwupdate::
  use_templates:
    - uefi-sign-with-fwupd-key

Workflow:debian-pipeline:grub2::
  use_templates:
    - uefi-sign-with-grub-key
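To illustrate how template references are resolved, here is a hypothetical sketch (the function name and the dict-based entry layout are assumptions, not Debusine's actual data model). It yields entries depth-first, so that templates are processed before the entries that use them:

```python
def expand_entry(name, entries):
    """Yield the names of template entries referenced (recursively) by
    `name`, followed by `name` itself. Illustrative sketch only."""
    for template in entries[name].get("use_templates", []):
        yield from expand_entry("template:" + template, entries)
    yield name

# Mirroring the UEFI signing example above (values elided):
entries = {
    "template:uefi-sign": {},
    "template:uefi-sign-with-grub-key": {"use_templates": ["uefi-sign"]},
    "Workflow:debian-pipeline:grub2::": {
        "use_templates": ["uefi-sign-with-grub-key"],
    },
}
print(list(expand_entry("Workflow:debian-pipeline:grub2::", entries)))
# → ['template:uefi-sign', 'template:uefi-sign-with-grub-key',
#    'Workflow:debian-pipeline:grub2::']
```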

Reducing workflow complexity

Being able to store overrides at the worker task level saves us from adding too many configuration parameters to the workflows: the only required parameters are those needed to control the orchestration step.

For example, we can have configuration for the sbuild worker task next to the configuration for the debian-pipeline workflow:

Workflow:debian-pipeline:::
  default_values:
    ...

Worker:sbuild::stretch:
  override_values:
    backend: incus-lxc

This shows how the sbuild_backend parameter may no longer need to be an input of the debian-pipeline workflow, even though it remains available.

Integration with tasks

To be able to apply changes to the submitted task_data configuration, we need to be able to know the subject and the context, which may depend on information not available when the task is created. For example, the subject may be derived from an artifact that is the output of a previous work request in a workflow.

Task configuration can thus be applied only when a task becomes pending, and subject and context are generated at that time using the task’s debusine.db.models.tasks.DBTask.get_task_configuration_subject_context() method.

Algorithm to apply the configuration

  • First, build a single combined “configuration entry” from all the relevant collection items. The items are processed in lookup order, with the items referenced from use_templates integrated just before the item that references them, using the following operations:

    default_values = dict()
    override_values = dict()
    locked_values = set()
    
    for config_item in all_items:
        # Drop all the entries referenced in `delete_values` (except
        # locked values); pop() with a default tolerates missing keys
        for key in config_item.delete_values:
            if key in locked_values:
                continue
            default_values.pop(key, None)
            override_values.pop(key, None)
    
        # Merge the default/override values in the response
        # (except locked values)
        for key, value in config_item.default_values.items():
            if key in locked_values:
                continue
            default_values[key] = value
        for key, value in config_item.override_values.items():
            if key in locked_values:
                continue
            override_values[key] = value
    
        # Update the set of locked values
        locked_values.update(config_item.lock_values)
    
    return (default_values, override_values)
    
  • The operations of that single combined entry are then applied to the data available in task_data:

    new_task_data = task_data.copy()
    default_values, override_values = get_merged_task_configuration()
    
    # Apply default values (add missing values, but also replace explicit
    # None values)
    for k, v in default_values.items():
        if new_task_data.get(k) is None:
            new_task_data[k] = v
    
    # Apply overrides
    new_task_data.update(override_values)
    
  • The result is stored in WorkRequest.configured_task_data, which will be used from that point on as the task’s data, while WorkRequest.task_data remains untouched as documentation for the initial task input.
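Putting the two steps together, here is a self-contained sketch of the algorithm (the function names, the dict-based item shape, and the example keys are assumptions for illustration, not Debusine's actual implementation), including the locking behaviour:

```python
def merge_entries(all_items):
    """Combine configuration items in lookup order; locked keys cannot
    be changed by later (more specific) items. Illustrative sketch."""
    default_values, override_values, locked_values = {}, {}, set()
    for item in all_items:
        for key in item.get("delete_values", []):
            if key not in locked_values:
                default_values.pop(key, None)
                override_values.pop(key, None)
        for key, value in item.get("default_values", {}).items():
            if key not in locked_values:
                default_values[key] = value
        for key, value in item.get("override_values", {}).items():
            if key not in locked_values:
                override_values[key] = value
        locked_values.update(item.get("lock_values", []))
    return default_values, override_values


def apply_configuration(task_data, all_items):
    """Return the configured task data; the submitted dict is untouched."""
    default_values, override_values = merge_entries(all_items)
    new_task_data = task_data.copy()
    for key, value in default_values.items():
        # Defaults fill in missing keys and replace explicit None values.
        if new_task_data.get(key) is None:
            new_task_data[key] = value
    new_task_data.update(override_values)
    return new_task_data


# A general entry locks "backend"; the more specific entry cannot change it.
items = [
    {"override_values": {"backend": "unshare"}, "lock_values": ["backend"]},
    {"override_values": {"backend": "incus-lxc"},
     "default_values": {"build_components": ["any"]}},
]
print(apply_configuration({"backend": None, "host_architecture": "amd64"}, items))
# → {'backend': 'unshare', 'host_architecture': 'amd64',
#    'build_components': ['any']}
```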