Task types

There are six types of tasks, each with its own peculiarities, but all of them are scheduled through work requests. A work request therefore identifies the precise task to execute with a combination of a task_type and a task_name value.
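As a minimal sketch of that identification scheme, the following models a work request as a plain data class. The class names, the enum member values and the task_key helper are assumptions made for illustration; they are not debusine's actual model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class TaskType(Enum):
    """The six task types described in this section (values are assumed)."""

    WORKER = "Worker"
    SERVER = "Server"
    INTERNAL = "Internal"
    WORKFLOW = "Workflow"
    SIGNING = "Signing"
    WAIT = "Wait"


@dataclass
class WorkRequestSketch:
    """Hypothetical stand-in for a work request."""

    task_type: TaskType
    task_name: str
    task_data: dict[str, Any] = field(default_factory=dict)

    def task_key(self) -> tuple[str, str]:
        # The (task_type, task_name) pair pins down the task to execute.
        return (self.task_type.value, self.task_name)


request = WorkRequestSketch(TaskType.INTERNAL, "synchronization_point")
print(request.task_key())
```

Two work requests with the same task_name but different task_type values would refer to different tasks under this scheme, which is why both fields are needed.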

Worker tasks

Worker tasks are the most common type of task. They run on external workers, often within a controlled execution environment. They may execute untrusted code, such as building a source package uploaded by a user.

Worker tasks can only interact with debusine through the public API. Each worker has a dedicated token that has the proper permissions to retrieve the required artifacts and to upload the generated artifacts.

Worker tasks can require specific features from the workers on which they will run. This can be used to ensure that the assigned worker:

  • supports some specific architecture (when managing builders with different architectures)

  • has enough memory

  • has enough disk space

  • has some specific executables

  • etc.
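The matching of task requirements against worker features listed above could be sketched as follows. WorkerInfo, TaskRequirements, their field names and can_run_on are all hypothetical; the sketch only illustrates the idea and does not reflect how debusine stores or compares this metadata.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkerInfo:
    """Hypothetical description of what a worker offers."""

    architectures: set[str]
    memory_mib: int
    disk_mib: int
    executables: set[str] = field(default_factory=set)


@dataclass
class TaskRequirements:
    """Hypothetical description of what a task needs from its worker."""

    architecture: Optional[str] = None
    min_memory_mib: int = 0
    min_disk_mib: int = 0
    executables: set[str] = field(default_factory=set)


def can_run_on(req: TaskRequirements, worker: WorkerInfo) -> bool:
    """Return True if the worker satisfies every stated requirement."""
    if req.architecture is not None and req.architecture not in worker.architectures:
        return False
    if worker.memory_mib < req.min_memory_mib:
        return False
    if worker.disk_mib < req.min_disk_mib:
        return False
    # Every required executable must be available on the worker.
    return req.executables <= worker.executables


builder = WorkerInfo({"amd64", "i386"}, memory_mib=8192, disk_mib=50000,
                     executables={"sbuild"})
print(can_run_on(TaskRequirements(architecture="amd64", min_memory_mib=4096), builder))
print(can_run_on(TaskRequirements(architecture="arm64"), builder))
```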

Execution environment

Debusine supports several virtualisation backends to execute Worker tasks, from lightweight containers (e.g. unshare) to VMs (e.g. incus-vm). Some tasks let you control the virtualisation backend to use through the backend parameter (in task_data).

The currently supported backends are:

  • unshare: lightweight container built with /usr/bin/unshare

  • incus-lxc: LXC container managed by Incus

  • qemu: QEMU virtual machine

  • incus-vm: QEMU virtual machine managed by Incus
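A small sketch of reading the backend parameter from task_data is shown below. The function name and the error handling are assumptions; the sketch only knows the four backend names listed above, and a real deployment may accept other values or apply a default that is not modelled here.

```python
from typing import Any, Optional

# Backend names taken from the list above.
SUPPORTED_BACKENDS = ("unshare", "incus-lxc", "qemu", "incus-vm")


def requested_backend(task_data: dict[str, Any]) -> Optional[str]:
    """Return the backend named in task_data, or None if none was requested."""
    backend = task_data.get("backend")
    if backend is None:
        # No explicit choice: leave the decision to the task itself.
        return None
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return backend


print(requested_backend({"backend": "incus-vm"}))
```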

When tasks are executed in an executor backend, one of the task inputs is an environment, an artifact containing a system image that the task is executed in. These image artifacts are downloaded by the worker and cached locally. For some backends (e.g. Incus) they’ll be converted and/or imported into an image store.

The worker maintains an LRU cache of up to 10 images. When the worker cleans up images, it also removes them from any relevant image stores.
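The LRU behaviour can be illustrated with a short sketch built on collections.OrderedDict. ImageCacheSketch, its use method and the on_evict hook (standing in for removal from an image store) are hypothetical names; the image names and paths are made up, and the small capacity in the demo is only there to make eviction visible.

```python
from collections import OrderedDict
from typing import Callable, Optional


class ImageCacheSketch:
    """Keep at most max_images entries, evicting the least recently used."""

    def __init__(self, max_images: int = 10,
                 on_evict: Optional[Callable[[str], None]] = None) -> None:
        self.max_images = max_images
        self.on_evict = on_evict
        self._images: "OrderedDict[str, str]" = OrderedDict()

    def use(self, image_id: str, path: str) -> None:
        """Record that an image was used, adding it to the cache if needed."""
        if image_id in self._images:
            # Already cached: mark it as the most recently used entry.
            self._images.move_to_end(image_id)
            return
        self._images[image_id] = path
        while len(self._images) > self.max_images:
            # Oldest entry sits at the front of the ordered dict.
            evicted_id, _ = self._images.popitem(last=False)
            if self.on_evict is not None:
                self.on_evict(evicted_id)

    def cached(self) -> list[str]:
        """Return cached image ids, least recently used first."""
        return list(self._images)


removed: list[str] = []
cache = ImageCacheSketch(max_images=2, on_evict=removed.append)
for image in ["bookworm", "trixie", "bookworm", "sid"]:
    cache.use(image, f"/cache/{image}.tar.xz")
print(cache.cached(), removed)
```

In the demo, reusing "bookworm" refreshes it, so "trixie" is the entry evicted when "sid" arrives.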

Server tasks

Server tasks perform operations that require direct database access and that may take some time to run. They run on Celery workers, and must not execute any user-controlled code.

Since server tasks have database access, they can analyze their parent workflow, including all the completed work requests and the generated artifacts. They can also consume and generate runtime data that will be available for other steps in the workflow (through the internal collection associated with the workflow’s root WorkRequest).

Internal tasks

Internal tasks are used to structure workflows and represent operations that are typically handled by the debusine scheduler itself. There are only two internal tasks currently:

Synchronization points

Internal tasks with task_name set to synchronization_point are tasks that do nothing. Hence they also don’t need any input data and their associated task_data in a work request is an empty dictionary.

Their main use is to provide synchronization points in a graph of blocked work requests. In particular, they can be used to represent the entry or exit points of sub-workflows or of groups of related work requests.

When such a work request becomes pending, it is immediately marked as completed by the scheduler, thus unblocking work requests that depend on it.

This work request typically has a non-empty workflow_data explaining its purpose and influencing the rendering of the workflow’s visual representation.
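The scheduler behaviour for synchronization points can be sketched with a tiny in-memory dependency graph. The Node class, the status strings and the unblock_ready function are assumptions for illustration; debusine's real scheduler works on database records and is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory work request."""

    name: str
    task_type: str
    task_name: str
    status: str = "blocked"
    depends_on: list["Node"] = field(default_factory=list)


def unblock_ready(nodes: list[Node]) -> None:
    """Unblock nodes whose dependencies completed; finish sync points at once."""
    changed = True
    while changed:
        changed = False
        for node in nodes:
            if node.status == "blocked" and all(
                dep.status == "completed" for dep in node.depends_on
            ):
                node.status = "pending"
                changed = True
            if (
                node.status == "pending"
                and node.task_type == "Internal"
                and node.task_name == "synchronization_point"
            ):
                # Nothing to execute: mark it completed immediately.
                node.status = "completed"
                changed = True


build = Node("build", "Worker", "example_build", status="completed")
sync = Node("sync", "Internal", "synchronization_point", depends_on=[build])
test = Node("test", "Worker", "example_test", depends_on=[sync])
unblock_ready([test, sync, build])
print(sync.status, test.status)
```

Because the synchronization point completes as soon as it becomes pending, the test node that depends on it is unblocked in the same pass.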

Workflow callbacks

Internal tasks with task_name set to workflow are integrated in strategic points of a workflow’s graph of work requests to ask the scheduler to re-run the workflow orchestrator when that work request becomes executable.

Note

In such a work request, the associated task_data is an empty dictionary, but the workflow_data dictionary must have a step key to identify the callback being executed.

The workflow orchestrator to run is identified by following the parent relationship and looking up the task_name (since the parent work request should be a workflow).

This gives the workflow orchestrator an opportunity to review the progress of the workflow and to add additional work requests (or alter the structure of the workflow) based on results of already-completed work requests.

The orchestrator is run in a Celery task, and the associated internal work request is marked as completed when the Celery task has completed.
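The callback dispatch described above can be sketched as follows. The WR class, the ORCHESTRATORS registry, run_callback and the example workflow name are hypothetical, and the orchestrator is called synchronously here rather than in a Celery task, purely to keep the sketch self-contained.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class WR:
    """Hypothetical in-memory work request."""

    task_type: str
    task_name: str
    task_data: dict[str, Any] = field(default_factory=dict)
    workflow_data: dict[str, Any] = field(default_factory=dict)
    parent: Optional["WR"] = None
    status: str = "pending"


# Hypothetical registry mapping a workflow's task_name to its orchestrator.
ORCHESTRATORS: dict[str, Callable[[WR, str], None]] = {}


def run_callback(callback: WR) -> None:
    """Re-run the parent workflow's orchestrator for the given callback."""
    if callback.task_type != "Internal" or callback.task_name != "workflow":
        raise ValueError("not a workflow callback")
    step = callback.workflow_data["step"]  # KeyError if the step key is missing
    parent = callback.parent
    if parent is None or parent.task_type != "Workflow":
        raise ValueError("callback parent must be a workflow")
    # The parent's task_name selects the orchestrator to run.
    ORCHESTRATORS[parent.task_name](parent, step)
    callback.status = "completed"


seen: list[tuple[str, str]] = []
ORCHESTRATORS["example_workflow"] = lambda wf, step: seen.append((wf.task_name, step))

workflow = WR("Workflow", "example_workflow")
callback = WR("Internal", "workflow", workflow_data={"step": "after-build"},
              parent=workflow)
run_callback(callback)
print(seen, callback.status)
```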

Workflow tasks

Workflow tasks represent a collection of other tasks; see Workflows.

Signing tasks

Signing tasks are like Worker tasks, but run on restricted signing workers. They typically interact with secret keys that are required to perform the requested operation.

Wait tasks

Wait tasks represent steps in workflows where debusine needs to wait until something else happens (user interaction or some other part of debusine). The task name determines what debusine is waiting for.

Completing such tasks can either involve an API call performed by some user, or some sort of regular job that monitors something and marks the task as completed when some criteria are met.

The needs_input field in a Wait task’s workflow data is True if it requires user input, otherwise False.
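As an illustration of how the needs_input flag might be used, the sketch below filters a list of work requests down to the Wait tasks that need user input. The dictionary layout, the task type string and the task names are assumptions; only the needs_input key inside the workflow data comes from the description above.

```python
from typing import Any, Iterable


def awaiting_user_input(work_requests: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the Wait tasks whose workflow data says they need user input."""
    return [
        wr
        for wr in work_requests
        if wr.get("task_type") == "Wait"
        and wr.get("workflow_data", {}).get("needs_input") is True
    ]


# Hypothetical work requests; the task names are made up for this example.
requests = [
    {"task_type": "Wait", "task_name": "example_approval",
     "workflow_data": {"needs_input": True}},
    {"task_type": "Wait", "task_name": "example_delay",
     "workflow_data": {"needs_input": False}},
    {"task_type": "Worker", "task_name": "example_build", "workflow_data": {}},
]
print([wr["task_name"] for wr in awaiting_user_input(requests)])
```

A user interface could use such a filter to show only the steps that are blocked on a person rather than on another part of the system.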