Tasks

Ontology for generic tasks

While tasks are unique in theory, we can have different tasks sharing some commonalities. In the Debian context in particular, we have different ways to build Debian packages with different helper programs (sbuild, pbuilder, etc.) and we want those tasks to reuse the same set of parameters so that they can be called interchangeably.

This public interface is materialized by a generic task that can be scheduled by the users and that will run one of the available implementations that can run on one of the available workers.

This section documents those generic tasks and their interface.

There are some task_data keys that apply to all tasks:

  • notifications (optional): a dictionary containing:

    • on_failure (required): a specification of what to do if the task fails, formatted as an array of dictionaries as follows:

      • channel (required): the NotificationChannel to use for this notification

      • data (optional): a dictionary as follows (for email channels; this may change for other notification methods):

        • from (optional): the email address to send this notification from (defaults to the channel’s from property)

        • to (optional): a list of email addresses to send this notification to (defaults to the channel’s to property)

        • cc (optional): a list of email addresses to CC this notification to (defaults to the channel’s cc property, if any)

        • subject (optional): the subject line for this notification (defaults to the channel’s subject property, or failing that to WorkRequest $work_request_id completed in $work_request_result); the strings ${work_request_id} and ${work_request_result} (or $work_request_id and $work_request_result, provided that they are not followed by valid identifier characters) are replaced by their values
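
The placeholder rules for subject match those of Python's string.Template, so the substitution can be sketched as below. This is only an illustration under that assumption: the render_subject helper, the channel name and the email address are made up for the example.

```python
from string import Template

# Default subject quoted in the description of the subject key.
DEFAULT_SUBJECT = "WorkRequest $work_request_id completed in $work_request_result"


def render_subject(subject_template, work_request_id, work_request_result):
    """Expand $name and ${name} placeholders, leaving unknown ones untouched."""
    return Template(subject_template).safe_substitute(
        work_request_id=work_request_id,
        work_request_result=work_request_result,
    )


# Hypothetical task_data fragment: channel name and address are invented.
task_data = {
    "notifications": {
        "on_failure": [
            {
                "channel": "admin-team",
                "data": {
                    "to": ["qa@example.org"],
                    "subject": "Build ${work_request_id}: ${work_request_result}",
                },
            }
        ]
    }
}

subject = task_data["notifications"]["on_failure"][0]["data"]["subject"]
print(render_subject(subject, 42, "failure"))
print(render_subject(DEFAULT_SUBJECT, 42, "failure"))
```

A placeholder such as $work_request_idx is left as-is, because the name is followed by a valid identifier character.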

Task data key names are used in pydantic models, and must therefore be syntactically valid Python identifiers (although they may collide with keywords, in which case pydantic aliases should be used).
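
As a rough illustration of that constraint, the standard library can tell apart names that are fine, names that collide with a keyword (such as the from key above) and names that are not identifiers at all. The classify_key helper is hypothetical and is not part of debusine.

```python
import keyword


def classify_key(name):
    """Classify a candidate task_data key name."""
    if not name.isidentifier():
        return "invalid"  # cannot be used as a model field name
    if keyword.iskeyword(name):
        return "needs alias"  # valid identifier, but a Python keyword
    return "ok"


for key in ("source_artifact_id", "from", "binary-artifacts"):
    print(key, "->", classify_key(key))
```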

Task PackageBuild

A generic task to represent a package build, i.e. the act of transforming a source package (.dsc) into binary packages (.deb).

The task_data associated to this task can contain the following keys:

  • input (required): a dictionary describing the input data

    • source_artifact_id (required): ID of a source package artifact, used to retrieve the source package to build.

    • extra_binary_artifact_ids (optional): a list of artifact IDs. If provided, these binary package artifacts (debian:binary-package or debian:binary-packages) are downloaded and made available to apt when installing build-dependencies.

  • distribution (required if backend is schroot): name of the target distribution.

  • environment (required if backend is not schroot): ID of an artifact of category debian:system-tarball or debian:system-image, depending on the backend type. The qemu and incus-vm backends require a debian:system-image artifact, while the other backends require a debian:system-tarball artifact.

  • backend (optional, defaults to unshare): If auto, the task uses the default. Supported backends: incus-lxc, incus-vm, qemu, schroot, and unshare.

  • extra_repositories (optional): a list of extra repositories to enable. Each repository is described by a dictionary with the following possible keys:

    • sources_list: a single line for an APT sources.list file

    • authentication_key (optional): the ascii-armored public key used to authenticate the repository

  • host_architecture (required): the architecture that we want to build for. It defines the architecture of the resulting architecture-specific .deb (if any)

  • build_architecture (optional, defaults to the host architecture): the architecture on which we want to build the package (implies cross-compilation if different from the host architecture). Can be explicitly set to the undefined value (Python’s None or JavaScript’s null) if we want to allow cross-compilation with any build architecture.

  • build_components (optional, defaults to any): list that can contain the following 3 words (cf dpkg-buildpackage --build=any,all,source):

    • any: enables build of architecture-specific .deb

    • all: enables build of architecture-independent .deb

    • source: enables build of the source package (.dsc)

  • build_profiles: list of build profiles to enable during package build (cf dpkg-buildpackage --build-profiles)

  • build_options: value of DEB_BUILD_OPTIONS during build

  • build_path (optional, default unset): forces the build to happen through a path named according to the passed value. When this value is not set, there’s no restriction on the name of the path.
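
The conditional requirements above (distribution for schroot, environment for every other backend) can be checked with a small sketch like the following. The check_package_build helper and the artifact IDs are invented for the example; it only mirrors the rules stated in this section and is not the validation that debusine performs.

```python
SUPPORTED_BACKENDS = {"incus-lxc", "incus-vm", "qemu", "schroot", "unshare"}
DEFAULT_BACKEND = "unshare"
BUILD_COMPONENTS = {"any", "all", "source"}


def check_package_build(task_data):
    """Return a list of problems found in a PackageBuild task_data dict."""
    problems = []
    backend = task_data.get("backend", DEFAULT_BACKEND)
    if backend == "auto":
        backend = DEFAULT_BACKEND  # auto falls back to the default backend
    if backend not in SUPPORTED_BACKENDS:
        problems.append(f"unsupported backend: {backend}")
    if "source_artifact_id" not in task_data.get("input", {}):
        problems.append("input.source_artifact_id is required")
    if "host_architecture" not in task_data:
        problems.append("host_architecture is required")
    if backend == "schroot" and "distribution" not in task_data:
        problems.append("distribution is required when backend is schroot")
    if backend != "schroot" and "environment" not in task_data:
        problems.append("environment is required when backend is not schroot")
    unknown = set(task_data.get("build_components", ["any"])) - BUILD_COMPONENTS
    if unknown:
        problems.append(f"unknown build_components: {sorted(unknown)}")
    return problems


# Hypothetical request: artifact IDs 5 and 12 are made up.
task_data = {
    "input": {"source_artifact_id": 5},
    "environment": 12,
    "host_architecture": "amd64",
    "build_components": ["any", "all"],
}
print(check_package_build(task_data))
```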

Task SystemBootstrap

A generic task to represent the bootstrapping of a Debian system out of an APT repository. The end result of such a task is to generate an artifact of category debian:system-tarball.

The task_data associated to this task can contain the following keys:

  • bootstrap_options: a dictionary with a few global options:

    • variant (optional): maps to the --variant command line option of debootstrap

    • extra_packages (optional): list of extra packages to include in the bootstrapped system

    • architecture (required): the native architecture of the built Debian system. The task will be scheduled on a system of that architecture.

  • bootstrap_repositories: a list of repositories used to bootstrap the Debian system. Note that not all implementations might support multiple repositories.

    • types (optional): a list of source types to enable among deb (binary repository) and deb-src (source repository). Defaults to a list with deb only.

    • mirror (required): the base URL of a mirror containing APT repositories in $mirror/dists/$suite

    • suite (required): name of the distribution’s repository to use for the bootstrap

    • components (optional): list of components to use in the APT repository (e.g. main, contrib, non-free, …). Defaults to downloading the Release file from the suite and using all the components listed in it.

    • check_signature_with (optional, defaults to system): indicates whether we want to check the repository signature with the system-wide keyrings (system), or with the external keyring documented in the keyring key (value external), or whether we don’t want to check it at all (value no-check).

    • keyring_package (optional): install an extra keyring package in the bootstrapped system

    • keyring (optional): provide an external keyring for the bootstrap

      • url (required): URL of the external keyring to download

      • sha256sum (optional): SHA256 checksum of the keyring to validate the downloaded file

      • install (boolean, defaults to False): if True, the downloaded keyring is installed and used in the target system.

  • customization_script (optional): a script that is copied into the target chroot, executed from inside the chroot and then removed. It lets you perform arbitrary customizations to the generated system. You can use apt to install extra packages. If you want to use something more elaborate than a shell script, you need to make sure to install the appropriate interpreter during the bootstrap phase with the extra_packages key.
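
To make the defaults of bootstrap_repositories concrete, here is a minimal sketch that fills them in for one repository entry. The normalize_repository helper is hypothetical, and the rule that external requires a keyring entry is an assumption drawn from the description of check_signature_with.

```python
import copy

REPOSITORY_DEFAULTS = {"types": ["deb"], "check_signature_with": "system"}
SIGNATURE_CHECKS = {"system", "external", "no-check"}


def normalize_repository(repository):
    """Apply the documented defaults to one bootstrap_repositories entry."""
    for key in ("mirror", "suite"):
        if key not in repository:
            raise ValueError(f"{key} is required")
    normalized = {**copy.deepcopy(REPOSITORY_DEFAULTS), **repository}
    check = normalized["check_signature_with"]
    if check not in SIGNATURE_CHECKS:
        raise ValueError(f"unknown check_signature_with value: {check}")
    # Assumption: "external" only makes sense when a keyring is provided.
    if check == "external" and "keyring" not in normalized:
        raise ValueError("check_signature_with=external needs a keyring")
    return normalized


# Example task_data for a SystemBootstrap task (values are illustrative).
task_data = {
    "bootstrap_options": {
        "architecture": "amd64",
        "variant": "minbase",
        "extra_packages": ["ca-certificates"],
    },
    "bootstrap_repositories": [
        {"mirror": "https://deb.debian.org/debian", "suite": "bookworm"}
    ],
}
print(normalize_repository(task_data["bootstrap_repositories"][0]))
```

The components key is left unset here, since its default depends on the contents of the Release file.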

Task SystemImageBuild

This generic task is an extension of the SystemBootstrap generic task: it should generate a disk image artifact complying with the debian:system-image definition. That disk image contains a Debian-based system matching the description provided by the SystemBootstrap interface.

The following additional keys are supported:

  • disk_image

    • format (required): desired format for the disk image. Supported values are raw and qcow2.

    • filename (optional): base of the generated disk image filename.

    • kernel_package (optional): name of the kernel package to install. The default value is linux-image-generic, which is only available on Bullseye and later, and only on some architectures.

    • bootloader (optional): name of the bootloader package to use, the default value is systemd-boot on architectures that support it.

    • partitions (required): a list of partitions, each represented by a dictionary with the following keys:

      • size (required): size of the partition in gigabytes

      • filesystem (required): filesystem used in the partition, can be none for no filesystem, swap for a swap partition, or freespace for free space that doesn’t result in any partition (it will thus just offset the position of the following partitions).

      • mountpoint (optional, defaults to none): mountpoint of the partition in the target system, can be none for a partition that doesn’t get a mountpoint.
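
The effect of a freespace entry is easiest to see with a worked layout. The sketch below computes where each real partition starts and ends, in gigabytes. The layout_partitions helper and the ext4 filesystem value are assumptions made for the example, not something this interface defines.

```python
def layout_partitions(partitions):
    """Compute start/end offsets (in GB) of the partitions that get created."""
    offset = 0
    layout = []
    for partition in partitions:
        size = partition["size"]
        if partition["filesystem"] != "freespace":
            layout.append(
                {
                    "start": offset,
                    "end": offset + size,
                    "filesystem": partition["filesystem"],
                    "mountpoint": partition.get("mountpoint", "none"),
                }
            )
        # freespace creates no partition but still shifts what follows
        offset += size
    return layout


# Hypothetical disk_image description.
disk_image = {
    "format": "qcow2",
    "partitions": [
        {"size": 1, "filesystem": "freespace"},
        {"size": 8, "filesystem": "ext4", "mountpoint": "/"},
        {"size": 2, "filesystem": "swap"},
    ],
}
for entry in layout_partitions(disk_image["partitions"]):
    print(entry)
```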

Specifications of tasks

This section lists all the available tasks, with the input that they accept, a description of what they do, and the artifacts that they generate.

The tasks listed in this section are those that you can use to submit work requests.

Autopkgtest task

The task_data associated to this task can contain the following keys:

  • input (required): a dictionary describing the input data:

    • source_artifact_id (required): the debian:source-package or debian:upload artifact ID representing the source package to be tested with autopkgtest

    • binary_artifacts_ids (required): a list of debian:binary-packages or debian:upload artifact IDs representing the binary packages to be tested with autopkgtest (they are expected to be part of the same source package as the one identified with source_artifact_id)

    • context_artifacts_ids (optional): a list of debian:binary-packages or debian:upload artifact IDs representing a special context for the tests. This is used to trigger autopkgtests of reverse dependencies, where context_artifacts_ids is set to the artifacts of the updated package whose reverse dependencies are tested, and the source/binary artifact IDs are those of one of the reverse dependencies whose autopkgtests will be executed.

  • host_architecture (required): the Debian architecture that will be used in the chroot or VM where tests are going to be run. The packages submitted in input:binary_artifacts_ids usually have a matching architecture (but need not in the case of cross-architecture package testing, e.g. testing i386 packages on an amd64 system).

  • environment (Single lookup with default category debian:environments, required): debian:system-tarball or debian:system-image artifact (as appropriate for the selected backend) that will be used to run the tests.

  • backend (optional): the virtualization backend to use, defaults to auto where the task is free to use the most suitable backend. Supported: incus-lxc, incus-vm, qemu, and unshare.

  • include_tests (optional): a list of the tests that will be executed. If not provided (or empty), defaults to all tests being executed. Translates into --test-name=TEST command line options.

  • exclude_tests (optional): a list of tests that will be skipped. If not provided (or empty), then no tests are skipped. Translates into the --skip-test=TEST command line options.

  • debug_level (optional, defaults to 0): a debug level between 0 and 3. Translates into -d up to -ddd command line options.

  • extra_apt_sources (optional): a list of APT sources. Each APT source is described by a single line (deb http://MIRROR CODENAME COMPONENT) that is copied to a file in /etc/apt/sources.list.d. Translates into --add-apt-source command line options.

  • use_packages_from_base_repository (optional, defaults to False): if True, then we pass --apt-default-release=$DISTRIBUTION with the name of the base distribution given in the distribution key.

  • extra_environment (optional): a dictionary listing environment variables to inject in the build and test environment. Translates into (multiple) --env=VAR=VALUE command line options.

  • needs_internet (optional, defaults to “run”): Translates directly into the --needs-internet command line option. Allowed values are “run”, “try” and “skip”.

  • fail_on (optional): indicates whether the work request must be marked as failed in the different scenarios identified by the following sub-keys:

    • failed_test (optional, defaults to true): at least one test has failed (and the test was not marked as flaky).

    • flaky_test (optional, defaults to false): at least one flaky test has failed.

    • skipped_test (optional, defaults to false): at least one test has been skipped.

  • timeout (optional): a dictionary where each key/value pair maps to the corresponding --timeout-KEY=VALUE command line option with the exception of the global key that maps to --timeout=VALUE. Supported keys are global, factor, short, install, test, copy and build.
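
The fail_on sub-keys and their defaults amount to a small decision rule, sketched below. The work_request_failed helper and its counter arguments are hypothetical; how the task actually counts failed, flaky and skipped tests is not described here.

```python
FAIL_ON_DEFAULTS = {"failed_test": True, "flaky_test": False, "skipped_test": False}


def work_request_failed(fail_on=None, failed=0, flaky_failed=0, skipped=0):
    """Decide whether the work request is marked as failed.

    failed counts failed tests not marked as flaky, flaky_failed counts
    failed flaky tests, skipped counts skipped tests.
    """
    settings = {**FAIL_ON_DEFAULTS, **(fail_on or {})}
    return bool(
        (settings["failed_test"] and failed > 0)
        or (settings["flaky_test"] and flaky_failed > 0)
        or (settings["skipped_test"] and skipped > 0)
    )


print(work_request_failed(failed=1))
print(work_request_failed(flaky_failed=2))
print(work_request_failed({"skipped_test": True}, skipped=1))
```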

Note

At this point, we have deliberately not added any key for the --pin-packages option because that option is not explicit enough: differences between the mirror used to schedule jobs and the mirror used by the jobs result in tests that are not testing the version that we want. At this point, we believe it’s better to submit all modified packages explicitly via input:context_artifacts_ids so that we are sure of the .deb that we are submitting and testing with. That way we can even test reverse dependencies before the modified package is available in any repository.

This assumes that we can submit arbitrary .deb on the command line and that they are effectively used as part of the package setup.

autopkgtest is always run with the options --apt-upgrade --output-dir=ARTIFACT-DIR --summary=ARTIFACT-DIR/summary --no-built-binaries.
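
Putting the fixed options together with the per-key translations listed earlier, the option list could be assembled roughly as follows. The autopkgtest_options helper is a sketch, not the task's implementation. It assumes that --needs-internet is always passed with its default, and it leaves out use_packages_from_base_repository as well as the positional arguments (packages and virtualization backend).

```python
def autopkgtest_options(task_data, artifact_dir):
    """Translate task_data keys into autopkgtest command line options."""
    options = [
        "--apt-upgrade",
        f"--output-dir={artifact_dir}",
        f"--summary={artifact_dir}/summary",
        "--no-built-binaries",
    ]
    options += [f"--test-name={t}" for t in task_data.get("include_tests", [])]
    options += [f"--skip-test={t}" for t in task_data.get("exclude_tests", [])]
    debug_level = task_data.get("debug_level", 0)
    if debug_level:
        options.append("-" + "d" * debug_level)  # 1 -> -d, 3 -> -ddd
    options += [
        f"--add-apt-source={line}" for line in task_data.get("extra_apt_sources", [])
    ]
    options += [
        f"--env={name}={value}"
        for name, value in task_data.get("extra_environment", {}).items()
    ]
    options.append(f"--needs-internet={task_data.get('needs_internet', 'run')}")
    for key, value in task_data.get("timeout", {}).items():
        if key == "global":
            options.append(f"--timeout={value}")
        else:
            options.append(f"--timeout-{key}={value}")
    return options


# Hypothetical task_data fragment.
task_data = {
    "include_tests": ["smoke"],
    "debug_level": 2,
    "extra_environment": {"LANG": "C.UTF-8"},
    "timeout": {"global": 3600, "test": 600},
}
print(autopkgtest_options(task_data, "/tmp/out"))
```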

An artifact of category debian:autopkgtest is generated to store all output files, and is described in the artifacts reference.

Mmdebstrap task

The mmdebstrap task fully implements the SystemBootstrap interface.

On top of the keys defined in that interface, it also supports the following additional keys in task_data:

  • bootstrap_options

    • use_signed_by (defaults to True): if set to False, then we do not pass the keyrings to APT via the Signed-By sources.list option, instead we rely on the --keyring command line parameter.

The keys from bootstrap_options are mapped to command line options:

  • variant maps to --variant (and it supports more values than debootstrap, see its manual page)

  • extra_packages maps to --include

The keys from bootstrap_repositories are used to build a sources.list file that is then fed to mmdebstrap as input.
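
A minimal sketch of that conversion is shown below, using the one-line sources.list format. The sources_list_lines helper is hypothetical. It assumes that components has already been resolved for every repository, and it ignores the keyring and Signed-By handling.

```python
def sources_list_lines(repositories):
    """Build one-line-style sources.list entries from bootstrap_repositories."""
    lines = []
    for repository in repositories:
        components = " ".join(repository["components"])
        for source_type in repository.get("types", ["deb"]):
            lines.append(
                f"{source_type} {repository['mirror']} "
                f"{repository['suite']} {components}"
            )
    return lines


# Illustrative repository description.
repositories = [
    {
        "types": ["deb", "deb-src"],
        "mirror": "https://deb.debian.org/debian",
        "suite": "bookworm",
        "components": ["main", "contrib"],
    }
]
print("\n".join(sources_list_lines(repositories)))
```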

SimpleSystemImageBuild task

The simplesystemimagebuild task implements the SystemImageBuild interface except that it expects a single entry in the list of partitions: the entry for the root filesystem (thus with a mountpoint of /).

In terms of compliance with the SystemBootstrap interface, the bootstrap phase only uses a single repository but the remaining repositories are enabled after the bootstrap.

This task is implemented with the help of the debos tool.

Lintian task

The task_data associated to this task can contain the following keys:

  • input (required): a dictionary of values describing the input data. At least one of the sub-keys is required; both can be given at the same time.

    • source_artifact_id (optional): the debian:source-package or debian:upload artifact ID representing the source package to be tested with lintian

    • binary_artifacts_ids (optional): a list of debian:binary-package, debian:binary-packages, or debian:upload artifact IDs representing the binary packages to be tested with lintian (they are expected to be part of the same source package as the one identified with source_artifact_id)

Note

While it’s possible to submit only a source or only a single binary artifact, you should aim to always submit source + arch-all + arch-any related artifacts to have the best test coverage as some tags can only be emitted when lintian has access to all of them at the same time.

  • environment (Single lookup with default category debian:environments, required): debian:system-tarball artifact that will be used to run lintian. Must have lintian installed.

  • backend (optional): the virtualization backend to use, defaults to auto where the task is free to use the most suitable backend. Supported options: incus-lxc, incus-vm, unshare.

  • output (optional): a dictionary of values controlling some aspects of the generated artifacts

    • source_analysis (optional, defaults to True): indicates whether we want to generate the debian:lintian artifact for the source package

    • binary_all_analysis (optional, defaults to True): same as source_analysis but for the debian:lintian artifact related to Architecture: all packages

    • binary_any_analysis (optional, defaults to True): same as source_analysis but for the debian:lintian artifact related to Architecture: any packages

  • target_distribution (optional): the fully qualified name of the distribution that will provide the lintian software to analyze the packages. Defaults to debian:unstable.

  • include_tags (optional): a list of the lintian tags that are allowed to be reported. If not provided (or empty), defaults to all. Translates into the --tags or --tags-from file command line option.

  • exclude_tags (optional): a list of the lintian tags that are not allowed to be reported. If not provided (or empty), then no tags are hidden. Translates into the --suppress-tags or --suppress-tags-from file command line option.

  • fail_on_severity (optional, defaults to none): if the analysis emits tags of that severity or higher, then the task will return a “failure” instead of a “success”. Valid values are (in decreasing severity) “error”, “warning”, “info”, “pedantic”, “experimental”, “overridden”. “none” is a special value indicating that we should never fail.
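
The fail_on_severity rule relies on the ordering of severities given above. A sketch of the comparison, with a hypothetical lintian_result helper that takes the list of severities of the emitted tags:

```python
# Severities in decreasing order, as listed for fail_on_severity.
SEVERITIES = ["error", "warning", "info", "pedantic", "experimental", "overridden"]


def lintian_result(fail_on_severity, emitted_severities):
    """Return "failure" if any emitted tag is at or above the threshold."""
    if fail_on_severity == "none":
        return "success"  # special value: never fail
    threshold = SEVERITIES.index(fail_on_severity)
    for severity in emitted_severities:
        # A lower index means a higher severity.
        if SEVERITIES.index(severity) <= threshold:
            return "failure"
    return "success"


print(lintian_result("warning", ["info", "pedantic"]))
print(lintian_result("warning", ["info", "error"]))
print(lintian_result("none", ["error"]))
```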

The lintian runs will always use the options --display-level ">=classification" (>=pedantic in jessie) --no-cfg --display-experimental --info --show-overrides to collect the full set of data that lintian can provide.

Note

Current lintian can generate “masked” tags (with M: prefix) when you use --show-overrides. For the purpose of debusine, we entirely ignore those tags on the basis that it’s lintian’s decision to hide them (and not the maintainer’s decision) and as such, they don’t bring any useful information. Lintian is full of exceptions to not emit some tags and the fact that some tags rely on a modular exception mechanism that can be diverted to generate masked tags is not useful to package maintainers.

For those reasons, we suggested to lintian’s maintainers to entirely stop emitting those tags in https://bugs.debian.org/1053892

Between 1 and 3 artifacts of category debian:lintian will be generated (one for each source/binary package artifact submitted) and they will have a “relates to” relationship with the corresponding artifact that has been analyzed. The debian:lintian artifacts are described in the artifacts reference.

Sbuild task

Regarding inputs, the sbuild task is compatible with the ontology defined for Task PackageBuild even though it implements only a subset of the possible options at this time.

Currently unsupported PackageBuild task keys:

  • extra_repositories

  • build_architecture / build_profiles

  • build_options

  • build_path

Output artifacts and relationships:

  1. debian:package-build-log: sbuild output

    • relates-to: source_artifact_id

    • relates-to: artifact 2 (debian:binary-packages)

  2. debian:binary-packages: the binary packages (*.deb) built from the source package

    • relates-to: source_artifact_id

  3. debian:upload: artifact 2 plus the right administrative files (.changes, .buildinfo) necessary for its binary upload

    • extends: artifact 2

    • relates-to: artifact 2

  4. debusine:work-request-debug-logs: debusine-specific worker logs

    • relates-to: source_artifact_id

Piuparts task

A specific task to represent a binary package check using the piuparts utility.

The task_data associated to this task can contain the following keys:

  • input (required): a dictionary describing the input data

    • binary_artifacts_ids (required): a list of debian:binary-packages or debian:upload artifact IDs representing the binary packages to be tested. Multiple Artifacts can be provided so as to support e.g. testing binary packages from split indep/arch builds.

  • backend (optional, defaults to unshare): if auto, the task uses the default. Supported backends: incus-lxc, incus-vm, schroot, and unshare.

  • environment (Single lookup with default category debian:environments, required): artifact of category debian:system-tarball that will be used to run piuparts itself.

  • base_tgz_id (required): ID of an artifact of category debian:system-tarball that will be used to run piuparts tests, through piuparts --base-tgz. If the artifact’s data has with_dev: True, the task will remove the files /dev/* before using it.

  • host_architecture (required): the architecture that we want to test on.

The piuparts output will be provided as a new artifact.

Blhc task

A task to represent a build log check using the blhc utility.

The task_data associated to this task can contain the following keys:

  • input (required): a dictionary describing the input data

    • artifact (Single lookup, required): a debian:package-build-log artifact corresponding to the build log to be checked. The file should have a .build suffix.

  • extra_flags (optional): a list of flags to be passed to the blhc command, such as --bindnow or --pie. If an unsupported flag is passed then the request will fail.

The blhc output will be provided as a new artifact of category debian:blhc, described in the artifacts reference.

The task returns success if blhc returns an exit code of 0 or 1, and failure otherwise.

UpdateDerivedCollection task

This is a generic server-side task that compares two collections, one of which is derived from the other, and creates any work requests necessary to update the derived collection.

The task_data for this task may contain the following keys:

  • base_collection_id (required): the ID of the “base” collection which we are using as a source of data

  • derived_collection_id (required): the ID of the “derived” collection that we are updating

  • child_task_data (optional): a dictionary to use as the task_data of child work requests, with additional items merged into it as indicated by the specific implementation; for example, it may be useful to specify an environment here

  • force (boolean, defaults to False): if True, schedule work requests for each matching artifact in the base collection regardless of whether there is already a corresponding artifact in the derived collection (for example, this might be useful when the implementation of the task has changed)

Specific tasks based on this interface are responsible for determining the relevant subsets of active items in each of the base and derived collections that are compared, for defining the desired derived item names given a set of base items, and for defining the work requests needed to perform each individual update to the derived collection.

This task takes the relevant subset of the derived collection and finds the items in the base collection from which each of them was derived, using derived_from in each of the per-item data fields. (Multiple items may be derived from the same base items.) It then compares these items to the relevant subset of the base collection and determines the derived items that need to be changed given the current contents of the base collection, in one of these ways:

  • add: a derived item is desired but does not exist

  • replace: a derived item is desired with the same name as one that already exists, but either its base items have changed or force is True

  • remove: a derived item exists but is not desired

The definition of a collection item’s name guarantees that only one active item with a given name may exist in any given collection, so it is a convenient key to use here.
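
The comparison described above can be sketched with item names as keys. In this illustration, desired maps each desired derived item name to the set of base item IDs it should be derived from, and existing maps each current derived item name to the IDs recorded in its derived_from field. The plan_updates helper, the data shapes and the item names and IDs are all made up for the example.

```python
def plan_updates(desired, existing, force=False):
    """Classify derived item names into add, replace and remove."""
    plan = {"add": [], "replace": [], "remove": []}
    for name, base_items in desired.items():
        if name not in existing:
            plan["add"].append(name)
        elif force or set(existing[name]) != set(base_items):
            plan["replace"].append(name)
    plan["remove"] = [name for name in existing if name not in desired]
    return {action: sorted(names) for action, names in plan.items()}


# Hypothetical collections.
desired = {
    "hello_1.0_amd64": {1, 2},
    "hello_1.0_all": {1, 3},
    "bye_2.0_source": {7},
}
existing = {
    "hello_1.0_amd64": {1, 2},
    "hello_1.0_all": {1},
    "old_0.1_all": {9},
}
print(plan_updates(desired, existing))
print(plan_updates(desired, existing, force=True))
```

With force set, items whose base items are unchanged are also scheduled for replacement.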

For each derived item that should be added or replaced, the task creates suitable child work requests to create a new derived item and update the derived collection, according to the specific implementation. It only creates a work request if another work request with the same parameters does not already exist, or if force is True.

For each derived item that should be removed, the task immediately removes it from the derived collection.

If this task is part of a workflow, then each of the created work requests is created as a sub-step of it in the same workflow.

UpdateSuiteLintianCollection task

This task implements the UpdateDerivedCollection task interface, updating a derived debian:suite-lintian collection from a base debian:suite collection.

All active items in both collections are considered relevant.

Given a base debian:binary-package artifact with an architecture other than all, a derived item with the name {srcpkg_name}_{srcpkg_version}_{architecture} is desired.

Given a base debian:source-package artifact or a base debian:binary-package artifact with Architecture: all, derived items with the names {srcpkg_name}_{srcpkg_version}_{architecture} are desired for the following values of architecture:

  • source (only for a source package artifact with no corresponding binary package artifacts)

  • all

  • each architecture where another base debian:binary-package artifact exists for the same source package name and version

The child work requests are for Lintian tasks, with the following task_data in addition to anything specified in this task’s child_task_data:

  • input:

    • source_artifact_id: the relevant debian:source-package artifact ID in the base collection

    • binary_artifacts_ids: a list of the relevant debian:binary-package artifact IDs in the base collection

Todo

We also need to specify how the child task is told to update the derived collection. This will probably depend on actions, as specified in !630.