Tag-based scheduling
This blueprint introduces the idea of attaching a set of strings called tags to tasks and workers, and using them during scheduling to match pending tasks with the workers that can execute them.
This blueprint is limited to addressing the problem of scheduling tasks. Tags could be used for other purposes, such as matching tasks to the task configuration entries applied to them, but uses other than scheduling are considered outside the scope of this blueprint.
The core of the idea is that tasks can provide a set of tags for workers, and can require that workers have a set of tags. Conversely, workers can provide a set of tags for tasks, and can require that tasks have a set of tags.
Once the sets of provided and required tags are stored in the database in
WorkRequest and Worker instances, finding the list of suitable workers
for a task at scheduling time can be done with two set containment operations:
the worker-required tags need to be a subset of the task-provided tags, and
the task-required tags need to be a subset of the worker-provided tags.
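As a minimal sketch of the matching predicate (using plain Python sets here; a real implementation would likely express these checks as database queries):

```python
def worker_is_suitable(
    task_provided: set[str],
    task_required: set[str],
    worker_provided: set[str],
    worker_required: set[str],
) -> bool:
    """Check whether a worker can be assigned a pending task.

    Both conditions are plain set containment:
    - the worker-required tags must be a subset of the task-provided tags;
    - the task-required tags must be a subset of the worker-provided tags.
    """
    return worker_required <= task_provided and task_required <= worker_provided
```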
This design shifts the complexity away from the scheduler and towards the process of building the worker/task provided/required tag sets. Therefore, this blueprint contains a number of use cases, and for each of them it drafts a possible tag ontology and a process for building the provided/required tag sets that can address the use case.
General assumptions
We assume that tag values are namespaced by prefixes followed by a colon
(:), and that tags with the same prefix share some amount of homogeneity
and semantics. A set of tags with the same prefix is called an ontology.
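For illustration only (the blueprint does not prescribe parsing code), tags can be selected by their ontology prefix like this:

```python
def tags_in_ontology(tags: set[str], prefix: str) -> set[str]:
    """Return the tags that belong to the ontology with the given prefix."""
    # For example, tags_in_ontology(tags, "worker:build-arch") selects
    # tags such as "worker:build-arch:amd64" and "worker:build-arch:arm64".
    return {tag for tag in tags if tag.startswith(prefix + ":")}
```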
We assume that the security-sensitive operations are adding provided tags and removing required tags: an attack that removes provided tags or adds required tags can only make a task harder to schedule, by reducing the number of candidate workers it can use.
The security problem of removing required tags from an existing tag set only becomes relevant if we design mechanisms to do that, which is not something that we are considering at this stage.
If we are going to need complex rules, like full boolean expressions for tag
derivation, they can be applied when building the tag set, before the work
request is in PENDING state. Once the work request is PENDING, the sets of
task-provided and task-required tags are final and cannot be changed anymore.
This makes space for complex use cases while keeping complexity away from the
scheduler. It also makes it possible to build database indexes of the tag sets
of work requests in PENDING state, for use by the scheduler.
At least the final task-provided tag set for a task will be preserved for the rest of the lifetime of the task, and possibly the task-required tags as well. Besides helping with debugging, task tags are likely to be useful for other use cases as well, such as browsing tasks in the UI.
Use cases
Replace BaseTask.can_run_on
BaseTask.can_run_on currently needs to execute code on every potential
task/worker match, which makes the scheduler inefficient.
Its current implementations ensure that:
a task is executed only on a worker that can execute that task type, name and version;
the task build architecture, if present, is supported by the worker’s system architectures;
a worker has an executor backend of the type a task may require;
a worker has the commands installed that the task needs (autopkgtest, mmdebstrap, sbuild).
Potential tag-based solution:
Have worker-provided tags corresponding to significant system capabilities (for example: type of worker, catalog of executable tasks, relevant installed packages, available executor backends, supported build architectures).
Tasks can then analyze their requirements when building dynamic task data, and build a list of matching task-required tags.
This effectively moves what can_run_on is doing from the scheduling phase
to the earlier dynamic task data computation phase.
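As a sketch of what this could look like for a hypothetical sbuild-like task (the method name compute_required_tags and the concrete values are assumptions; the tag names follow the draft ontologies later in this blueprint):

```python
class SbuildTask:
    """Hypothetical task computing its worker requirements (sketch)."""

    TASK_VERSION = 1

    def __init__(self, backend: str, host_architecture: str):
        self.backend = backend
        self.host_architecture = host_architecture

    def compute_required_tags(self) -> set[str]:
        """Build the task-required tags during dynamic task data computation."""
        return {
            # The worker must be able to run this task type, name and version.
            f"worker:worker:sbuild:version:{self.TASK_VERSION}",
            # The worker must have the sbuild command installed.
            "worker:capability:sbuild",
            # The worker must offer the requested executor backend.
            f"worker:executor:{self.backend}",
            # The worker must support the requested build architecture.
            f"worker:build-arch:{self.host_architecture}",
        }
```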
Security boundaries:
There are no significant security issues in this use case: a wrong scheduling decision would produce a task that cannot be executed and ends with an error.
Some packages are only built on “large” builders
Packages like the linux kernel or libreoffice may be too resource-intensive to build on just any worker, so we need to be able to limit the candidate workers of “large packages” to “large workers”.
Potential tag-based solution:
Use worker-provided tags (for example:
worker:class) to classify selected workers with
worker:class:large.
Use the task configuration system to add a worker:class:large task-required
tag.
Alternatively, if the source package name gets encoded in a task-provided tag
(for example: task:source-package), a tag
inference rule could add the worker:class:large task-required tag for all
tasks that have task:source-package:libreoffice or task:source-package:linux.
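As a sketch of how such an inference rule could be expressed as data (the format is purely illustrative, not a proposed configuration syntax):

```python
# Hypothetical inference rule: if a task provides any of these
# source-package tags, add worker:class:large to its required tags.
LARGE_PACKAGE_RULE = {
    "if_any_provided": [
        "task:source-package:libreoffice",
        "task:source-package:linux",
    ],
    "add_required": ["worker:class:large"],
}
```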
Security boundaries:
There are no significant security issues in this use case: a wrong scheduling decision would produce a task that fails due to limited resources and ends with an error.
Official Debian packages need to be built on official Debian workers
Not all builds in the Debian scope are targeted at Debian.
Examples of tasks building packages targeted at Debian:
Debian uploads
Uploads to workspaces that are staging repositories for transitions aimed at Debian
Generating build environments for Debian uploads
Examples of tasks building packages not targeted at Debian:
Uploads to workspaces representing personal repositories
Uploads to workspaces representing experiments
Experiment tasks could run on any workers, including dynamically provisioned cloud ones, while uploads targeted at Debian need to be built on official Debian workers for their binaries to be accepted into the distribution without the need for a rebuild.
Potential tag-based solution:
This can be handled by official Debian workers being configured by the administrator to provide a worker:class tag identifying them as such. Tasks that are intended to target Debian can then require that tag.
The tag could be set either by the workflow, if it is intended to apply on a task-by-task basis, or by the workspace via the task configuration mechanism, if it is intended to apply to an entire workspace.
Security boundaries:
This kind of task-directed worker selection does not have security issues, provided that the next use case is in place.
If a task is wrongly tagged as requiring an official Debian worker, but it has been created by a user that cannot use official Debian workers, the worst that can happen is that it will never get scheduled.
Note
A possible way of tracking that an artifact has been built on an official Debian worker is part of #404.
Official Debian workers only build packages uploaded by Debian developers or maintainers
Debian is a complex ecosystem, and people who are not Debian Developers or Debian Maintainers may eventually be allowed into the Debian scope, for example to support package sponsorship, mentoring, or extending personal repositories to non-DD and non-DM people.
This could however open the risk of arbitrary code execution by non-vetted people on workers. We could have workers set up where we design for that risk, but we need to protect official Debian workers from a potential vector of supply-chain attacks.
Potential tag-based solution:
When building the task-provided tag set, Debusine can enumerate the groups that
have the user in created_by as a member, and add a
task:group tag for each of them.
Official Debian workers can then require tasks to have the
task:group:debian::Debian tag, to ensure they are running code from Debian
Developers.
If however Debian workers can execute code from both Debian Developers and
Debian Maintainers, this is not sufficient, as workers would need to be
restricted to tasks having either task:group:debian::Debian or
task:group:debian::Maintainers.
That can be addressed with a task:class tag like
debian-official, assigned to tasks created by users who are members of either
the Debian or the Maintainers group.
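A sketch of how the task:group tags could be derived with a database query (the model layout and field names here are assumptions):

```python
def group_tags(work_request) -> set[str]:
    """Derive task:group tags from the groups of the task's creator."""
    # Group is assumed to be a Django model with a scope, an optional
    # workspace, and a many-to-many relation to users.
    tags = set()
    for group in Group.objects.filter(users=work_request.created_by):
        workspace = group.workspace.name if group.workspace else ""
        tags.add(f"task:group:{group.scope.name}:{workspace}:{group.name}")
    return tags
```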
Security boundaries:
This use case has security issues: being able to arbitrarily provide the task tags this relies on would allow untrusted users to gain arbitrary code execution on trusted Debian workers.
For a solution that relies on task:group tags,
when building the final task-provided tag set we can ensure that the
task:group tags are only provided by a database query.
For a solution that relies on task:class tags, we need a way to restrict providing the relevant task class to members of the given groups only, or a way for scope admins to set up scope-specific tag inference rules. Either would require groups or scopes to be encoded as provenances, to be able to configure the restriction.
For this blueprint we rely on vetted users not running builds with untrusted code. For example, we are not providing a way to defend against an official developer scheduling a build using untrusted extra artifacts. There is a plausible case of “official Debian workers only build using artifacts built by official Debian workers”, but it is a matter for a different design iteration, more related to SBOM tracking than to scheduler design.
Have workers that are dedicated to specific scopes / workspaces
In a SaaS situation, we may need to allocate a pool of workers to a given scope, in a way that tasks in that scope are only run in that dedicated set of workers, and that dedicated set of workers only runs tasks for that scope.
This would make it possible to honor things like confidentiality requirements and service level agreements.
Potential tag-based solution:
To make sure that dedicated workers execute only the tasks of their assigned scopes:
As long as partitioning works along one, and only one, scope, tasks can provide a task:scope tag, and workers can be configured to require it.
If a worker could be dedicated to multiple scopes, for example in case Freexian wanted to dedicate a single pool of workers to all SaaS customers of a given tier, then we can use an administrator-defined classification system (see task:class), classify tasks by tier name in all the relevant SaaS scopes, and have workers require that class.
To make sure tasks are only executed on their dedicated workers, workers can be classified by the admin (for example with worker:class) as dedicated to that partition. All tasks in the relevant scopes can then require the worker to provide that class tag.
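For illustration, the final tag sets for a worker dedicated to a single scope could end up looking like this (all values hypothetical):

```python
# A worker dedicated to the "acme" scope.
worker_provided = {
    "worker:class:dedicated-acme",
    # ... plus the usual capability, executor and build-arch tags ...
}
worker_required = {
    # Only accept tasks coming from the dedicated scope.
    "task:scope:acme",
}

# Tasks in the "acme" scope would in turn require the dedication class:
task_required = {"worker:class:dedicated-acme"}
```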
Security boundaries:
This case has security issues on both task-provided and worker-provided tags.
If a task can provide the tag identifying it as an arbitrary scope or class, then it can execute arbitrary code on a restricted worker.
If a worker can wrongly provide the tag identifying it as a dedicated worker for a given scope or class, then it can get assigned tasks that it should not see, breaking confidentiality agreements.
In the case of task-provided scope tags, we can enforce, when building the final task-provided tag set, that the task:scope tag matches the current scope.
In the case of arbitrary task class tags (like task:class), we need a way to configure specific tag values to be usable only in a restricted set of scopes.
Building tag sets for scheduling
Building tag sets for scheduling is an incremental operation: we start with empty tag sets and different sources contribute tags to them.
Four tag sets need to be built for each scheduling operation:
task-provided tags
task-required tags
worker-provided tags
worker-required tags
In most cases the final tag set is the simple union of the sets of tags provided by all relevant data sources.
For tag sets that may contain security-sensitive tags, there needs to be some level of validation before a tag is allowed to be added to the set.
To support more complex use cases, a set of tag derivation functions may be applied to tag sets before finalization, where some tags are added depending on boolean expressions evaluated on the presence of other tags.
Tag merging, validation and derivation
The backend for building a tag set can be a system that:
starts with an empty tag set
provides a way for merging another tag set, specifying its provenance
provides a way to finalize the tag set, possibly applying tag derivation expressions
The system should be configurable, possibly via the Django settings system, with:
an allow list of provenance restrictions associated with a list of tags or tag prefix matching expressions, specifying that any such tags will be removed from all tag sets that come from a provenance not in the allow list
a set of boolean expressions and associated tag sets, to derive extra tags during finalization
When a task is about to move to the PENDING state, or when a worker
updates its dynamic metadata, the system gathers the relevant candidate tag
sets, instantiates the relevant backends, and uses them to build the final
tag set used for scheduling.
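A sketch of such a backend (class and method names are assumptions, not an API proposal):

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TagSetBuilder:
    """Incrementally build a tag set from multiple provenances (sketch)."""

    # Maps a tag prefix to the set of provenances allowed to provide it.
    restrictions: dict[str, set[str]] = field(default_factory=dict)
    # Derivation rules: predicate on the current tag set -> tags to add.
    derivations: list[tuple[Callable[[set[str]], bool], set[str]]] = field(
        default_factory=list
    )
    _tags: set[str] = field(default_factory=set)

    def merge(self, tags: set[str], provenance: str) -> None:
        """Merge a tag set, dropping tags its provenance may not provide."""
        for tag in tags:
            allowed = all(
                provenance in provenances
                for prefix, provenances in self.restrictions.items()
                if tag.startswith(prefix)
            )
            if allowed:
                self._tags.add(tag)

    def finalize(self) -> frozenset[str]:
        """Apply derivation rules and return the final, immutable tag set."""
        for predicate, extra_tags in self.derivations:
            if predicate(self._tags):
                self._tags |= extra_tags
        return frozenset(self._tags)
```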
It makes sense for Debusine not to start with an empty configuration, but to
have a built-in hardcoded set of initial restrictions that cover the
system-provided tags, like restricting task:group:* tags to come only from
the system provenance.
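Continuing the sketch above, that built-in restriction could be expressed as:

```python
# task:group tags may only come from the "system" provenance,
# i.e. the database query that enumerates group memberships.
builder = TagSetBuilder(restrictions={"task:group:": {"system"}})
builder.merge({"task:group:debian::Debian"}, provenance="system")  # kept
builder.merge({"task:group:debian::Debian"}, provenance="user")    # dropped
```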
Draft tag ontologies
This section collects the draft tag ontologies that have been mentioned in this blueprint when covering each use case.
Worker-provided tag ontologies
worker:system
Describe the worker capabilities.
worker:system:worker_type:{worker_type}: type of worker, encoding the type of tasks that this worker can execute. This is provided by the database field Worker.worker_type and cannot be overridden.
This ontology could possibly be merged into worker:capability.
worker:worker, worker:server, worker:signing
Describe the tasks that the workers can execute:
worker:{task_type}:{task_name}:version:{version}: the worker can execute the task with the given type, name and version. This is provided by dynamic worker metadata and cannot be overridden.
worker:capability
Describe relevant capabilities available in the worker.
worker:capability:{name}: the worker has the named capability.
name could be the name of an installed package, the name of a suite of
packages providing a significant feature, or the name of an available hardware
feature.
Examples of relevant capability names: autopkgtest, mmdebstrap, sbuild,
qemu, dev-kvm, gpu, and so on.
worker:executor
Describe the executor backends available in the worker.
worker:executor:{name}: the given executor is available for use in the worker.
name can take values from the debusine.tasks.models.BackendType enum
except for auto.
Example executor names: unshare, incus-lxc, incus-vm and qemu.
worker:build-arch
Describe the build architectures supported in the worker.
worker:build-arch:{name}: the worker supports the given build architecture.
name is an architecture name as used in Debian, like amd64, arm64,
ppc64el, and so on.
worker:class
Assign one or more arbitrary classes to workers.
worker:class:{name}: the worker has been assigned the given class.
name is a classification system arbitrarily defined by the Debusine administrators.
Possible examples: small, large, trusted, official,
experimental, and so on.
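Putting the worker-provided ontologies together, the full tag set for a hypothetical amd64 worker might look like this (all values illustrative):

```python
worker_provided = {
    "worker:system:worker_type:external",
    "worker:worker:sbuild:version:1",
    "worker:worker:autopkgtest:version:1",
    "worker:capability:sbuild",
    "worker:capability:autopkgtest",
    "worker:executor:unshare",
    "worker:executor:qemu",
    "worker:build-arch:amd64",
    "worker:build-arch:i386",
    "worker:class:large",
}
```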
Task-provided tag ontologies
task:scope
Describe the scope that contains the task.
task:scope:{scope_name}: the task is in the given scope.
task:workspace
Describe the workspace that contains the task.
task:workspace:{scope_name}:{workspace_name}: the task is in the given workspace.
task:group
Declare that the user who created the task is a member of a group.
task:group:{scope_name}:{workspace_name}:{group_name}: the user is a member of this group.
scope_name and group_name correspond to the scope and name of the group.
Given that a group is only optionally assigned to a workspace,
workspace_name corresponds to the workspace name of the group, when
present, or to the empty string otherwise. A tag for a non-workspaced group
could then be, for example: task:group:debian::Debian.
task:source-package
Name of the source package for this task.
task:source-package:{name}: name of the source package for the task.
name corresponds to a source package name.
task:class
Assign one or more arbitrary classes to tasks.
task:class:{name}: the task has been assigned the given class.
name is a classification system arbitrarily defined by the Debusine administrators.
Possible examples: embargoed, security, urgent, official,
experimental, trusted, untrusted, and so on.
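Putting the task-provided ontologies together, the tag set for a hypothetical upload of the linux source package might look like this (all values illustrative):

```python
task_provided = {
    "task:scope:debian",
    "task:workspace:debian:base",
    "task:group:debian::Debian",
    "task:source-package:linux",
    "task:class:official",
}
```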
Postponed elements
Supersede subject and runtime context
Task tags also have potential to be used to supersede the subject and runtime context task metadata, and as such in matching tasks for task configuration.
Scheduling preferences
Tags could also be used to express scheduling preferences, for example to allow “small tasks” to prefer “small workers”, to keep large workers available for bigger tasks while at the same time using large workers for small tasks when all other workers are full.
The current scope of this blueprint is how to find suitable workers, and scheduling preferences would be a problem of ranking suitable workers, which looks like a different problem to design for.
It could very well be that the ranking problem could be solved by introducing a set of task-preferred tags, and scheduling could sort available and suitable workers by the decreasing number of task-preferred tags that they have.
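If that idea were pursued, the ranking step could be as simple as the following (purely speculative):

```python
def rank_workers(suitable_workers, task_preferred: set[str]):
    """Order suitable workers by how many task-preferred tags they provide."""
    return sorted(
        suitable_workers,
        key=lambda worker: len(task_preferred & worker.provided_tags),
        reverse=True,
    )
```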
It could also be that, by the time we need to rank suitable workers, we will have gathered enough experience with task execution statistics and more elaborate scheduling corner cases to come up with entirely different ideas.
Determining demand for worker pool provisioning
Tags may be used to determine whether a worker pool is suitable for scaling capacity to process a growing list of pending work requests.
One way to do that is to have the worker pool contain a preview of the full worker-provided tag list of the workers it can create, so that it would be possible to check how many pending tasks the worker pool could handle.
However, to keep the scope of this blueprint contained, this would be a discussion to be handled in a further design iteration.