Task types
There are six types of tasks, each with its own peculiarities, but
they are all scheduled through work requests. A work request therefore
identifies the precise task to execute with a combination of a task_type
and a task_name value.
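As a rough illustration of that pairing, the sketch below models a work request as a small Python dataclass. The class and member names (TaskType, WorkRequestSketch, task_key) are hypothetical and are not debusine's actual models; only the six type names, the task_type, task_name and task_data fields, and the synchronization_point task name come from this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class TaskType(Enum):
    """The six task types described on this page (member names are illustrative)."""

    WORKER = "Worker"
    SERVER = "Server"
    INTERNAL = "Internal"
    WORKFLOW = "Workflow"
    SIGNING = "Signing"
    WAIT = "Wait"


@dataclass
class WorkRequestSketch:
    """Hypothetical stand-in for a work request, not debusine's real model."""

    task_type: TaskType
    task_name: str
    task_data: dict[str, Any] = field(default_factory=dict)

    def task_key(self) -> tuple[str, str]:
        # The (task_type, task_name) pair is what pins down the task to execute.
        return (self.task_type.value, self.task_name)


sync = WorkRequestSketch(TaskType.INTERNAL, "synchronization_point")
print(sync.task_key())
```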
Worker tasks
Worker tasks are the most common type of task. They run on external
workers, often within a controlled execution environment. They may
execute untrusted code, such as building a source package uploaded by a
user.
Worker tasks can only interact with debusine through the public API. Each worker has a dedicated token that has the proper permissions to retrieve the required artifacts and to upload the generated artifacts.
Worker tasks can require specific features from the workers on which they will run. This can be used to ensure that the assigned worker:
supports some specific architecture (when managing builders with different architectures)
has enough memory
has enough disk space
has some specific executables
etc.
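The matching described in the list above can be sketched as a simple predicate. This is only an illustration under assumed names: WorkerFeatures, TaskRequirements, can_run_on and all field names are hypothetical, and debusine's real scheduler may represent and compare these features differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkerFeatures:
    """Hypothetical description of what a worker offers."""

    architectures: set[str]
    memory_mib: int
    disk_mib: int
    executables: set[str] = field(default_factory=set)


@dataclass
class TaskRequirements:
    """Hypothetical description of what a task needs from its worker."""

    architecture: Optional[str] = None
    min_memory_mib: int = 0
    min_disk_mib: int = 0
    executables: set[str] = field(default_factory=set)


def can_run_on(req: TaskRequirements, worker: WorkerFeatures) -> bool:
    """Return True if the worker satisfies every requirement of the task."""
    if req.architecture is not None and req.architecture not in worker.architectures:
        return False
    if worker.memory_mib < req.min_memory_mib:
        return False
    if worker.disk_mib < req.min_disk_mib:
        return False
    # Every required executable must be present on the worker.
    return req.executables <= worker.executables
```

For example, a task requiring the arm64 architecture would be rejected by a worker that only advertises amd64, regardless of how much memory or disk space that worker has.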
Execution environment
Debusine supports several virtualisation backends for executing Worker
tasks, from lightweight containers (e.g. unshare) to VMs
(e.g. incus-vm). Some tasks let you choose the virtualisation
backend through the backend parameter (in task_data).
The currently supported backends are:
unshare: lightweight container built with /usr/bin/unshare
incus-lxc: LXC container managed by Incus
qemu: QEMU virtual machine
incus-vm: QEMU virtual machine managed by Incus
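A minimal sketch of how a task might read and validate the backend parameter from its task_data follows. The helper name requested_backend and the choice to return None when no backend is given are assumptions made for illustration; the page does not say what debusine does when the parameter is absent.

```python
from typing import Any, Optional

# Backend names and descriptions as listed on this page.
SUPPORTED_BACKENDS = {
    "unshare": "lightweight container built with /usr/bin/unshare",
    "incus-lxc": "LXC container managed by Incus",
    "qemu": "QEMU virtual machine",
    "incus-vm": "QEMU virtual machine managed by Incus",
}


def requested_backend(task_data: dict[str, Any]) -> Optional[str]:
    """Return the backend named in task_data, or None if none was requested."""
    backend = task_data.get("backend")
    if backend is None:
        # Assumption: leaving the parameter out is allowed; the caller decides
        # what to do in that case.
        return None
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unsupported backend {backend!r}; "
            f"expected one of {sorted(SUPPORTED_BACKENDS)}"
        )
    return backend


print(requested_backend({"backend": "incus-vm"}))
```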
When tasks are executed in an executor backend, one of the task inputs is an environment, an artifact containing a system image that the task is executed in. These image artifacts are downloaded by the worker and cached locally. For some backends (e.g. Incus) they’ll be converted and/or imported into an image store.
The worker maintains an LRU cache of up to 10 images. When images are evicted from the cache, they are also removed from any relevant image stores.
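The caching behaviour can be pictured with a small LRU structure. This is a sketch under assumptions, not the worker's actual code: ImageCache, its methods, and the on_evict hook (standing in for removal from an image store) are hypothetical names; only the limit of 10 images and the least-recently-used policy come from the text.

```python
from collections import OrderedDict
from typing import Callable, Optional


class ImageCache:
    """Hypothetical LRU cache mapping image artifact ids to local paths."""

    def __init__(
        self,
        max_images: int = 10,
        on_evict: Optional[Callable[[int, str], None]] = None,
    ) -> None:
        self.max_images = max_images
        # Called for each evicted image, e.g. to delete it from an image store.
        self.on_evict = on_evict or (lambda artifact_id, path: None)
        self._images: "OrderedDict[int, str]" = OrderedDict()

    def get(self, artifact_id: int, download: Callable[[int], str]) -> str:
        """Return the local path of an image, downloading it on a cache miss."""
        if artifact_id in self._images:
            # Cache hit: mark the image as most recently used.
            self._images.move_to_end(artifact_id)
            return self._images[artifact_id]
        path = download(artifact_id)
        self._images[artifact_id] = path
        # Evict least recently used images until the cache fits the limit.
        while len(self._images) > self.max_images:
            old_id, old_path = self._images.popitem(last=False)
            self.on_evict(old_id, old_path)
        return path

    def cached_ids(self) -> list[int]:
        """Return cached artifact ids, least recently used first."""
        return list(self._images)
```

Routing eviction through a callback keeps the cache itself independent of any particular backend, so the same structure would work whether or not the backend imports images into a store.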
Server tasks
Server tasks perform operations that require direct database access
and that may take some time to run. They run on Celery workers, and must
not execute any user-controlled code.
Since server tasks have database access, they can analyze their
parent workflow, including all the completed work requests and the
generated artifacts. They can also consume and generate runtime data
that will be available to other steps in the workflow (through the
internal collection associated with the workflow’s root WorkRequest).
Internal tasks
Internal tasks are used to structure workflows and represent
operations that are typically handled by the debusine scheduler
itself. There are currently only two internal tasks:
Synchronization points
Internal tasks with task_name set to synchronization_point are
tasks that do nothing. Hence they don’t need any input data,
and their associated task_data in a work request is an empty
dictionary.
Their main use is to provide synchronization points in a graph of blocked work requests. In particular, they can be used to represent the entry or exit points of sub-workflows or of groups of related work requests.
When such a work request becomes pending, it is immediately marked as completed by the scheduler, thus unblocking work requests that depend on it.
This work request typically has a non-empty workflow_data explaining
its purpose and influencing the rendering of the workflow’s visual
representation.
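The way a synchronization point unblocks its dependants can be sketched with a toy scheduler pass. Everything here is illustrative: the Node class, the schedule function and the example_build / example_test task names are hypothetical, and the real scheduler is certainly more involved. Only the blocked, pending and completed states and the immediate completion of synchronization points are taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, simplified work request in a dependency graph."""

    task_type: str
    task_name: str
    status: str = "blocked"  # blocked -> pending -> completed
    depends_on: list["Node"] = field(default_factory=list)


def is_sync_point(node: Node) -> bool:
    return node.task_type == "Internal" and node.task_name == "synchronization_point"


def schedule(nodes: list[Node]) -> None:
    """Unblock nodes whose dependencies are completed, until nothing changes."""
    changed = True
    while changed:
        changed = False
        for node in nodes:
            ready = all(dep.status == "completed" for dep in node.depends_on)
            if node.status == "blocked" and ready:
                node.status = "pending"
                changed = True
            if node.status == "pending" and is_sync_point(node):
                # A synchronization point does nothing, so it completes at once.
                node.status = "completed"
                changed = True


build = Node("Worker", "example_build", status="completed")
sync = Node("Internal", "synchronization_point", depends_on=[build])
test = Node("Worker", "example_test", depends_on=[sync])
schedule([test, sync, build])
print(sync.status, test.status)
```

In this toy graph the synchronization point moves straight through pending to completed, which in turn lets the dependent worker task become pending.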
Workflow callbacks
Internal tasks with task_name set to workflow are integrated
at strategic points of a workflow’s graph of work requests to ask the
scheduler to re-run the workflow orchestrator when that work request
becomes executable.
Note
In such a work request, the associated task_data is an empty dictionary,
but the workflow_data dictionary must have a step key to
identify the callback being executed.
The workflow orchestrator to run is identified by following
the parent relationship and looking up the task_name
(since the parent work request should be a workflow).
This gives the workflow orchestrator an opportunity to review the progress of the workflow and to add additional work requests (or alter the structure of the workflow) based on results of already-completed work requests.
The orchestrator is run in a Celery task, and the associated internal work request is marked as completed when the Celery task has completed.
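The lookup and completion steps of a workflow callback can be sketched as follows. This is a simplified, synchronous illustration under assumed names: WorkRequestSketch, the ORCHESTRATORS registry and run_workflow_callback are hypothetical, and the orchestrator is called directly here rather than through Celery.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class WorkRequestSketch:
    """Hypothetical stand-in for a work request, not debusine's real model."""

    task_type: str
    task_name: str
    task_data: dict[str, Any] = field(default_factory=dict)
    workflow_data: dict[str, Any] = field(default_factory=dict)
    parent: Optional["WorkRequestSketch"] = None
    status: str = "pending"


# Hypothetical registry: workflow task_name -> orchestrator(workflow, step).
ORCHESTRATORS: dict[str, Callable[[WorkRequestSketch, str], None]] = {}


def run_workflow_callback(callback: WorkRequestSketch) -> None:
    """Re-run the parent workflow's orchestrator for the given callback."""
    if (callback.task_type, callback.task_name) != ("Internal", "workflow"):
        raise ValueError("not a workflow callback")
    if "step" not in callback.workflow_data:
        raise ValueError("workflow_data must contain a 'step' key")
    parent = callback.parent
    if parent is None or parent.task_type != "Workflow":
        raise ValueError("the parent work request should be a workflow")
    # The orchestrator is found through the parent's task_name.
    orchestrator = ORCHESTRATORS[parent.task_name]
    orchestrator(parent, callback.workflow_data["step"])
    # The callback is only completed once the orchestrator has finished.
    callback.status = "completed"
```

An orchestrator registered this way receives the step value, so a single workflow can tell its different callbacks apart and react to each one accordingly.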
Workflow tasks
Workflow tasks represent a collection of other tasks; see
Workflows.
Signing tasks
Signing tasks are like Worker tasks, but run on restricted signing
workers. They typically interact with secret keys that are required to perform
the requested operation.
Wait tasks
Wait tasks represent steps in workflows where debusine needs to wait
until something else happens (user interaction or some other part of
debusine). The task name determines what debusine is waiting for.
Completing such tasks can either involve an API call performed by some user, or some sort of regular job that monitors something and marks the task as completed when some criteria are met.
The needs_input field in a Wait task’s workflow data is True if it
requires user input, otherwise False.
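As a small illustration of how the needs_input field might be used, the sketch below separates Wait tasks that need a user from those that some regular job could complete. The WaitTaskSketch class, the awaiting_user helper and both task names are hypothetical; only the needs_input key in the workflow data comes from the text.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WaitTaskSketch:
    """Hypothetical stand-in for a work request whose task_type is Wait."""

    task_name: str
    workflow_data: dict[str, Any] = field(default_factory=dict)


def awaiting_user(tasks: list[WaitTaskSketch]) -> list[WaitTaskSketch]:
    """Return the Wait tasks whose workflow data says user input is required."""
    return [t for t in tasks if t.workflow_data.get("needs_input", False)]


tasks = [
    WaitTaskSketch("example_manual_approval", {"needs_input": True}),
    WaitTaskSketch("example_timer", {"needs_input": False}),
]
print([t.task_name for t in awaiting_user(tasks)])
```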