Tag-based scheduling

This blueprint introduces the idea of attaching a set of strings called tags to tasks and workers, and using them during scheduling to match pending tasks with the workers that can execute them.

This blueprint is limited to addressing the problem of scheduling tasks. Tags can be used for other purposes, like matching tasks to the task configuration entries applied to them, but uses other than scheduling are considered outside the scope of this blueprint.

The core of the idea is that tasks can provide a set of tags for workers, and can require that workers have a set of tags. Conversely, workers can provide a set of tags for tasks, and can require that tasks have a set of tags.

Once the sets of provided and required tags are stored in the database in WorkRequest and Worker instances, finding the list of suitable workers for a task at scheduling time can be done with two set containment operations: the worker-required tags need to be a subset of the task-provided tags, and the task-required tags need to be a subset of the worker-provided tags.
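
For illustration, the two checks can be expressed directly as Python set operations; the function and parameter names below are illustrative, not actual Debusine identifiers:

  def worker_matches(task_provided: set[str], task_required: set[str],
                     worker_provided: set[str], worker_required: set[str]) -> bool:
      """Return True if the worker is a suitable candidate for the task."""
      # worker-required tags must be a subset of task-provided tags,
      # and task-required tags must be a subset of worker-provided tags
      return worker_required <= task_provided and task_required <= worker_provided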

This design shifts the complexity away from the scheduler, and towards the process of building the worker/task provided/required tag sets. Therefore, this blueprint contains a number of use cases, and for each of them it drafts a possible tag ontology and a process for building the provided/required tag sets that can address the use case.

General assumptions

We assume that tag values are namespaced by a prefix followed by a colon (:), and that tags sharing a prefix follow common semantics. A set of tags with the same prefix is called an ontology.

We assume that the security-sensitive operations are adding provided tags and removing required tags: an attack that removes provided tags or adds required tags can only make a task harder to schedule, by reducing the number of candidate workers it can use.

The security problem of removing required tags from an existing tag set only becomes relevant if we design mechanisms to do that, which is not something that we are considering at this stage.

If we are going to need complex rules, like full boolean expressions for tag derivation, they can be applied when building the tag set, before the work request is in PENDING state. Once the work request is PENDING, the sets of task-provided and task-required tags are final and cannot be changed anymore. This makes space for complex use cases while keeping complexity away from the scheduler. It also makes it possible to build database indexes of the tag sets of work requests in PENDING state, for use by the scheduler.

At least the final task-provided tag set for a task will be preserved for the rest of the lifetime of the task, and possibly the task-required tags as well. Besides helping with debugging, task tags are also likely to be useful for other use cases, such as browsing tasks in the UI.

Use cases

Replace BaseTask.can_run_on

BaseTask.can_run_on currently needs to execute code for every potential task/worker match, which makes the scheduler inefficient.

Its current implementations ensure that:

  • a task is executed only on a worker that can execute that task type, name and version;

  • the task build architecture, if present, is supported by the worker’s system architectures;

  • a worker has an executor backend of the type a task may require;

  • a worker has a command installed that the task needs (autopkgtest, mmdebstrap, sbuild).

Potential tag-based solution:

Have worker-provided tags corresponding to significant system capabilities (for example: type of worker, catalog of executable tasks, relevant installed packages, available executor backends, supported build architectures).

Tasks can then analyze their requirements when building dynamic task data, and build a list of matching task-required tags.

This effectively moves what can_run_on is doing from the scheduling phase to the earlier dynamic task data computation phase.
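
As an illustration of that computation, the following sketch shows how a task could assemble its required tags while building dynamic task data; the function, parameters and tag values are assumptions made for the example, not existing Debusine code:

  def compute_required_tags(task_type: str, task_name: str, task_version: int,
                            backend: str | None, host_architecture: str | None) -> set[str]:
      """Sketch: derive task-required tags from task properties.

      task_type is expected in lowercase form ("worker", "server", "signing").
      """
      required = {f"worker:{task_type}:{task_name}:version:{task_version}"}
      if backend is not None and backend != "auto":
          # the task needs a specific executor backend on the worker
          required.add(f"worker:executor:{backend}")
      if host_architecture is not None:
          # the task needs the worker to support this build architecture
          required.add(f"worker:build-arch:{host_architecture}")
      return required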

Security boundaries:

There are no significant security issues in this use case: a wrong scheduling decision would produce a task that cannot be executed and ends with an error.

Some packages are only built on “large” builders

Packages like the linux kernel or libreoffice may be too large to be built on just any worker, so we need to be able to limit the candidate workers of “large packages” to “large workers”.

Potential tag-based solution:

Use worker-provided tags (for example: worker:class) to classify selected workers with worker:class:large.

Use the task configuration system to add a worker:class:large task-required tag.

Alternatively, if the source package name gets encoded in a task-provided tag (for example: task:source-package), a tag inference rule could add the worker:class:large task-required tag for all tasks that have task:source-package:libreoffice or task:source-package:linux.
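
For illustration, such an inference rule could take a shape similar to the following sketch; the rule format and helper function are hypothetical, not a defined configuration syntax:

  # If the task provides any of the listed tags, add worker:class:large to
  # its task-required tag set.
  LARGE_PACKAGE_RULE = {
      "if_any_provided": {"task:source-package:linux", "task:source-package:libreoffice"},
      "add_required": {"worker:class:large"},
  }

  def apply_rule(rule: dict, task_provided: set[str], task_required: set[str]) -> set[str]:
      if task_provided & rule["if_any_provided"]:
          return task_required | rule["add_required"]
      return task_required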

Security boundaries:

There are no significant security issues in this use case: a wrong scheduling decision would produce a task that fails due to limited resources and ends with an error.

Official Debian packages need to be built on official Debian workers

Not all builds in the Debian scope are targeted at Debian.

Examples of tasks building packages targeted at Debian:

  • Debian uploads

  • Uploads to workspaces that are staging repositories for transitions aimed at Debian

  • Generating build environments for Debian uploads

Examples of tasks building packages not targeted at Debian:

  • Uploads to workspaces representing personal repositories

  • Uploads to workspaces representing experiments

Experiment tasks could run on any workers, including dynamically provisioned cloud ones, while uploads targeted at Debian need to be built on official Debian workers for their binaries to be accepted into the distribution without the need for a rebuild.

Potential tag-based solution:

This can be handled by official Debian workers being configured by the administrator to provide a worker:class tag identifying them as such. Tasks that are intended to target Debian can then require that tag.

The required tag could be set either by the workflow, if it is meant to apply on a task-by-task basis, or by the workspace via the task configuration mechanism, if it is meant to apply to an entire workspace.

Security boundaries:

This kind of task-directed worker selection does not have security issues, provided that the next use case is in place.

If a task is wrongly tagged as requiring an official Debian worker, but it has been created by a user that cannot use official Debian workers, the worst that can happen is that it will never get scheduled.

Note

A possible way of tracking that an artifact has been built on an official Debian worker is part of #404.

Official Debian workers only build packages uploaded by Debian developers or maintainers

Debian is a complex ecosystem, and people who are not Debian Developers or Debian Maintainers may eventually be allowed into the Debian scope, for example to support package sponsorship, mentoring, or extending personal repositories to non-DD and non-DM people.

This could however open the risk of arbitrary code execution by non-vetted people on workers. We could have workers set up where we design for that risk, but we need to protect official Debian workers from a potential vector of supply-chain attacks.

Potential tag-based solution:

When building the task-provided tag set, Debusine can enumerate the groups of which the user in created_by is a member, and add a task:group tag for each of them.

Official Debian workers can then require tasks to have the task:group:debian::Debian tag, to ensure they are running code from Debian Developers.

If, however, Debian workers can execute code from both Debian Developers and Debian Maintainers, this is not sufficient: workers would need to be restricted to tasks having either task:group:debian::Debian or task:group:debian::Maintainers.

That can be addressed with a task:class tag like debian-official, assigned to tasks created by users who are members of either the Debian or the Maintainers group.
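
A possible sketch of the group-based tag derivation follows; the model and field names are assumptions for the example, and the actual Debusine models may differ:

  def group_tags_for(work_request) -> set[str]:
      """Sketch: derive task:group tags from the creator's group memberships."""
      tags = set()
      for group in work_request.created_by.groups.all():
          # groups are only optionally assigned to a workspace
          workspace_name = group.workspace.name if group.workspace else ""
          tags.add(f"task:group:{group.scope.name}:{workspace_name}:{group.name}")
      return tags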

Security boundaries:

This use case has security issues: being able to arbitrarily provide the task tags this relies on would allow untrusted users to gain arbitrary code execution on trusted Debian workers.

For a solution that relies on task:group tags, when building the final task-provided tag set we can ensure that the task:group tags are only provided by a database query.

For a solution that relies on task:class tags, we need a way to restrict providing the relevant task class to members of the given groups only, or a way for scope admins to set up scope-specific tag inference rules. Either would require groups or scopes to be encoded as provenances, so that the restriction can be configured.

For this blueprint we are relying on a vetted user not running builds with untrusted code. For example, we are not providing a way to defend against an official developer scheduling a build using untrusted extra artifacts. There would also be a case for “official Debian workers only build using artifacts built by official Debian workers”, but it belongs to a different design iteration, more related to SBOM tracking than to scheduler design.

Have workers that are dedicated to specific scopes / workspaces

In a SaaS situation, we may need to allocate a pool of workers to a given scope, so that tasks in that scope are only run on that dedicated set of workers, and that dedicated set of workers only runs tasks for that scope.

This would make it possible to honor things like confidentiality requirements and service level agreements.

Potential tag-based solution:

To make sure that dedicated workers execute only the tasks of their assigned scopes:

  • As long as partitioning works along one, and only one, scope, tasks can provide a task:scope tag, and workers can be configured to require it.

  • If a worker could be dedicated to multiple scopes, for example in case Freexian wanted to dedicate a single pool of workers to all SaaS customers of a given tier, then we can use an administrator-defined classification system (see task:class), classify tasks by tier name in all the relevant SaaS scopes, and have workers require that class.

To make sure tasks are only executed on their dedicated workers, workers can be classified by the admin (for example with worker:class) as dedicated to that partition. All tasks in the relevant scopes can then require the worker to provide that class tag.

Security boundaries:

This case has security issues on both task-provided and worker-provided tags.

If a task can wrongly provide the tag identifying it as belonging to an arbitrary scope or class, then it can execute arbitrary code on a restricted worker.

If a worker can wrongly provide the tag identifying it as a dedicated worker for a given scope or class, then it can get assigned tasks that it should not see, breaking confidentiality agreements.

In the case of task-provided scope tags, we can enforce, when building the final task-provided tag set, that the task:scope tag matches the current scope.

In the case of arbitrary task class tags (like task:class), we need a way to configure specific tag values to be usable only in a restricted set of scopes.
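
As an illustration, the scope check could be a small validation step applied while building the final task-provided tag set; the function below is a sketch with made-up names:

  def enforce_scope(task_provided: set[str], current_scope: str) -> set[str]:
      """Drop any task:scope tag that does not match the task's actual scope."""
      allowed = f"task:scope:{current_scope}"
      return {tag for tag in task_provided
              if not tag.startswith("task:scope:") or tag == allowed}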

Building tag sets for scheduling

Building tag sets for scheduling is an incremental operation, where we start with empty tag sets and different sources contribute tags to add to them.

Four tag sets need to be built for each scheduling operation:

  • task-provided tags

  • task-required tags

  • worker-provided tags

  • worker-required tags

In most cases the final tag set is the simple union of the sets of tags provided by all relevant data sources.

For tag sets that may contain security-sensitive tags, there needs to be some level of validation before a tag is allowed to be added to the set.

To support more complex use cases, a set of tag derivation functions may be applied to tag sets before finalization, where some tags are added depending on boolean expressions evaluated on the presence of other tags.

Tag merging, validation and derivation

The backend for building a tag set can be a system that (see the sketch after this list):

  • starts with an empty tag set

  • provides a way for merging another tag set, specifying its provenance

  • provides a way to finalize the tag set, possibly applying tag derivation expressions
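
A minimal sketch of such a builder follows, with made-up names and a deliberately simplified derivation format (a rule fires when any of its trigger tags is present); it is not an implemented API:

  class TagSetBuilder:
      def __init__(self, restrictions: dict[str, set[str]],
                   derivations: list[tuple[set[str], set[str]]]):
          # restrictions maps a tag prefix to the provenances allowed to
          # provide tags with that prefix
          self.restrictions = restrictions
          self.derivations = derivations
          self.tags: set[str] = set()

      def merge(self, tags: set[str], provenance: str) -> None:
          """Merge a tag set, dropping tags restricted to other provenances."""
          for tag in tags:
              allowed = next((provenances
                              for prefix, provenances in self.restrictions.items()
                              if tag.startswith(prefix)), None)
              if allowed is None or provenance in allowed:
                  self.tags.add(tag)

      def finalize(self) -> frozenset[str]:
          """Apply derivation rules and return the final, immutable tag set."""
          for trigger_tags, extra_tags in self.derivations:
              if self.tags & trigger_tags:
                  self.tags |= extra_tags
          return frozenset(self.tags)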

The system should be configurable, possibly via the Django settings system (see the sketch after this list), with:

  • an allow list of provenance restrictions associated with a list of tags or tag prefix matching expressions, specifying that any such tags will be removed from all tag sets that come from a provenance not in the allow list

  • a set of boolean expressions and associated tag sets, to derive extra tags during finalization
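
For illustration, such configuration could look like the following sketch; the setting names and data layout are assumptions, not existing Debusine settings:

  # Tags matching these prefixes may only come from the listed provenances;
  # they are dropped when merged from any other provenance.
  DEBUSINE_TAG_PROVENANCE_RESTRICTIONS = {
      "task:group:": {"system"},
      "task:scope:": {"system"},
  }

  # Derivation rules applied at finalization: if any trigger tag is present,
  # the associated tags are added to the set.
  DEBUSINE_TAG_DERIVATIONS = [
      ({"task:source-package:linux", "task:source-package:libreoffice"},
       {"worker:class:large"}),
  ]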

When a task is about to move to the PENDING state, or when a worker updates its dynamic metadata, the system gathers the relevant candidate tag sets, instantiates the relevant backend, and uses it to build the final tag set used for scheduling.

It makes sense for Debusine not to start with an empty configuration, but to have a built-in hardcoded set of initial restrictions that cover the system-provided tags, like restricting task:group:* tags to come only from the system provenance.

Workers providing tags

Possible tag provenances:

  • worker, for tags provided in dynamic worker metadata when a worker registers as active

  • admin, for tags provided in static worker metadata by the Debusine admins

  • system, from database queries

Workers provide a list of tags in the metadata sent at connection time, which are computed by analyze_worker code on the worker side and stored in the database in the Worker.dynamic_metadata JSONField.

Other tags can be specified in the Worker.static_metadata JSONField, to override tag information with admin-provided information.
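
For illustration, tags could appear in the two metadata fields along these lines; the key names and layout are assumptions for the example, not the actual metadata schema:

  # Worker.dynamic_metadata: computed by analyze_worker on the worker side
  dynamic_metadata = {
      "tags:provided": [
          "worker:executor:unshare",
          "worker:build-arch:amd64",
          "worker:capability:sbuild",
      ],
  }

  # Worker.static_metadata: admin-provided, overriding/extending the above
  static_metadata = {
      "tags:provided": ["worker:class:large"],
  }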

For workers in worker pools, we will need to add a way for admins who set up worker pools to specify static worker metadata for all new workers in the pool, thereby establishing a way for admin tags to be provided in the worker pool configuration.

We assume that we can trust the admins who set up the workers to provide valid static metadata.

We cannot assume that dynamic metadata provided by the worker can be trusted, as workers may get compromised.

Workers requiring tags

Possible tag provenances:

  • admin, for tags provided in static worker metadata by the Debusine admins

There are currently no known use cases for having worker as a provenance for worker-required tags.

Workers can require tags using the same dynamic/static metadata system. In the example use case of workers dedicated to specific scopes or workspaces, worker-required tags are likely to be configured in the static metadata only.

We can consider extending worker pools so that the worker pool configuration can specify worker-provided and worker-required tags, to be merged with dynamic and static worker metadata; this would make it possible, for example, to set up a worker pool dedicated to a single workspace. Such tags would probably be stored in static worker metadata, and contribute to the admin tag set.

Requiring tags is not security-sensitive, so the set of worker-required tags can be built simply as the union of the tag sets contributed by worker-related tag sources. In most cases, however, worker-required tags are going to come only from admin-provided static worker metadata.

Tasks providing tags

Possible tag provenances:

  • user, from task/workflow input in task_data

  • system, from database queries

  • workspace, from the task configuration system

  • scope:… or group:…, if we decide to allow scopes or groups to provide tags or tag inferences for task-provided tags (see this use case)

The final set of tags provided by a task is stored in its dynamic_metadata JSONField.

For some use cases, we cannot assume that all user-defined task-provided tags can be trusted, and in some of those cases that includes tags provided by the workspace task configuration.

It is possible that some use cases could benefit from being able to configure workflow templates to provide tags. Given that workflow templates are configured by the workspace admin, they would need to be tracked with a workspace provenance. Tags provided from workflow runtime parameters would be tracked with a user provenance.

Tasks requiring tags

Possible tag provenances:

  • user, from task/workflow input in task_data

  • system, from database queries

  • workspace, from the task configuration system

The final set of tags required by a task is stored in its dynamic_metadata JSONField.

Tasks can list required tags in their dynamic_metadata information, which can be computed from task information in the database (like task type, task name, task version), task data (like executor information, or arbitrary tag requirements from users), and task configuration (like specific worker requirements).

Adding required tags is not a security-sensitive operation, so as long as only adding tags is supported, and removing tags is not, tag requirements from all the sources listed above can be combined with a simple set merge operation.

Draft tag ontologies

This chapter collects the draft tag ontologies that have been mentioned in this blueprint while covering the different use cases.

Worker-provided tag ontologies

worker:system

Describe the worker capabilities.

  • worker:system:worker_type:{worker_type}: type of worker, encoding the type of tasks that this worker can execute. This is provided by the database field Worker.worker_type and cannot be overridden.

This ontology could potentially be merged into worker:capability.

worker:worker, worker:server, worker:signing

Describe the tasks that the workers can execute:

  • worker:{task_type}:{task_name}:version:{version}: the worker can execute the task with the given type, name and version. This is provided by dynamic worker metadata and cannot be overridden.

worker:capability

Describe relevant capabilities available in the worker.

  • worker:capability:{name}: the worker has the named capability.

name could be the name of an installed package, the name of a suite of packages providing a significant feature, or the name of an available hardware feature.

Examples of relevant capability names: autopkgtest, mmdebstrap, sbuild, qemu, dev-kvm, gpu, and so on.

worker:executor

Describe the executor backends available in the worker.

  • worker:executor:{name}: the given executor is available for use in the worker.

name can take values from the debusine.tasks.models.BackendType enum except for auto.

Example executor names: unshare, incus-lxc, incus-vm and qemu.

worker:build-arch

Describe the build architectures supported in the worker.

  • worker:build-arch:{name}: the worker supports the given build architecture.

name is an architecture name as used in Debian, like amd64, arm64, ppc64el, and so on.

worker:class

Assign one or more arbitrary classes to workers.

  • worker:class:{name}: the worker has been assigned the given class.

name is a classification system arbitrarily defined by the Debusine administrators.

Possible examples: small, large, trusted, official, experimental, and so on.

Task-provided tag ontologies

task:scope

Describe the scope that contains the task.

  • task:scope:{scope_name}: the task is in the given scope.

task:workspace

Describe the workspace that contains the task.

  • task:workspace:{scope_name}:{workspace_name}: the task is in the given workspace.

task:group

Declare that the user who created the task is a member of a group.

  • task:group:{scope_name}:{workspace_name}:{group_name}: the user is a member of this group.

scope_name and group_name correspond to the scope and name of the group.

Given that a group is only optionally assigned to a workspace, workspace_name corresponds to the workspace name of the group, when present, or to the empty string otherwise. A tag for a non-workspaced group could then be, for example: task:group:debian::Debian.

task:source-package

Name of the source package for this task.

  • task:source-package:{name}: name of the source package for the task.

name corresponds to a source package name.

task:class

Assign one or more arbitrary classes to tasks.

  • task:class:{name}: the task has been assigned the given class.

name is a classification system arbitrarily defined by the Debusine administrators.

Possible examples: embargoed, security, urgent, official, experimental, trusted, untrusted, and so on.
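
To tie the draft ontologies together, here is a worked example of a matching task/worker pair with illustrative tag values; the containment checks are the ones described earlier for scheduling:

  worker_provided = {
      "worker:system:worker_type:external",
      "worker:worker:sbuild:version:1",
      "worker:executor:unshare",
      "worker:build-arch:amd64",
      "worker:class:large",
  }
  worker_required = {"task:scope:debian"}

  task_provided = {"task:scope:debian", "task:source-package:linux"}
  task_required = {
      "worker:worker:sbuild:version:1",
      "worker:executor:unshare",
      "worker:build-arch:amd64",
      "worker:class:large",
  }

  # worker-required ⊆ task-provided, and task-required ⊆ worker-provided
  assert worker_required <= task_provided
  assert task_required <= worker_provided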

Postponed elements

Generate tags from task statistics

We can defer generating tags from task statistics to another design iteration, as using task configuration to require tags should be enough to deal with immediate use cases.

This said, it is possible to consider generating lists of dynamically provided tags based on task execution statistics, to be merged into worker-provided or task-required tag sets.

Supersede subject and runtime context

Task tags also have the potential to supersede the subject and runtime context task metadata, and thus to be used when matching tasks for task configuration.

Scheduling preferences

Tags could also be used to express scheduling preferences, for example to allow “small tasks” to prefer “small workers”, to keep large workers available for bigger tasks while at the same time using large workers for small tasks when all other workers are full.

The current scope of this blueprint is how to find suitable workers, and scheduling preferences would be a problem of ranking suitable workers, which looks like a different problem to design for.

It could very well be that the ranking problem could be solved by introducing a set of task-preferred tags, and scheduling could sort available and suitable workers by the decreasing number of task-preferred tags that they have.
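
A sketch of that ranking idea, with made-up attribute names:

  def rank_workers(suitable_workers, task_preferred: set[str]):
      """Sort suitable workers by how many task-preferred tags they provide."""
      return sorted(
          suitable_workers,
          key=lambda worker: len(worker.provided_tags & task_preferred),
          reverse=True,
      )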

It could also be that, by the time we get to needing to rank suitable workers, we will have gathered enough experience with task execution statistics and more elaborate scheduling corner cases to come up with entirely different ideas.

Determining demand for worker pool provisioning

Tags may be used to determine whether a worker pool is suitable for scaling capacity to process a growing list of pending work requests.

One way to do that is to have the worker pool contain a preview of the full worker-provided tag set of the workers it can create, making it possible to check how many pending tasks the worker pool could handle.

However, to keep the scope of this blueprint contained, this would be a discussion to be handled in a further design iteration.