Workflow for use by Debian CI

We want to allow ci.debian.net to use Debusine as a backend. Our existing Autopkgtest task and autopkgtest workflow can do most of the hard work, but will need some changes.

Overall model

ci.debian.net should have its own workspace and its own debian:qa-results collections. The ci.debian.net operators should have the OWNER role on this workspace, while bot accounts should have the CONTRIBUTOR role. This workspace can have suitable expiration policies, retaining logs for a few months and basic result history indefinitely (although ci.debian.net will keep its own copies of the history as well, at least to begin with).

debci will start workflows as needed, wait for them to complete, and collect the results. Until we have outbound webhooks to notify debci of workflow completion, it should simply poll occasionally for the status of workflows whose results it has not yet collected.
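The polling approach described above could look roughly like the following sketch. Everything here is an assumption made for illustration: `fetch_status` and `collect` stand in for whatever debci uses to query Debusine and to import results, and the status names in `TERMINAL_STATUSES` are placeholders rather than confirmed Debusine values.

```python
import time
from typing import Callable, Iterable

# Hypothetical terminal status names; the real values come from Debusine.
TERMINAL_STATUSES = {"completed", "aborted"}


def poll_pending(
    pending: Iterable[int],
    fetch_status: Callable[[int], str],
) -> tuple[list[int], list[int]]:
    """Poll each workflow once and split IDs into (finished, still_pending)."""
    finished: list[int] = []
    still_pending: list[int] = []
    for workflow_id in pending:
        if fetch_status(workflow_id) in TERMINAL_STATUSES:
            finished.append(workflow_id)
        else:
            still_pending.append(workflow_id)
    return finished, still_pending


def poll_until_done(
    pending: Iterable[int],
    fetch_status: Callable[[int], str],
    collect: Callable[[int], None],
    interval: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Collect results of finished workflows, sleeping between polling rounds."""
    remaining = list(pending)
    while remaining:
        finished, remaining = poll_pending(remaining, fetch_status)
        for workflow_id in finished:
            collect(workflow_id)
        if remaining:
            sleep(interval)
```

Once outbound webhooks exist, only the loop in `poll_until_done` would need replacing; the collection step could stay the same.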

In future, it might be possible to have britney dispatch requests to Debusine directly, and nothing in this design should make that more difficult. However, debci has quite a bit of intelligence of its own as well as the ci.debian.net frontend, so we recommend that it be kept in the loop at least for the first iteration of this work.

Looking up packages

We can assume that Debusine’s mirrored suites are reasonably up to date, and use lookups into them. This won’t be quite perfect, especially for packages that have been uploaded very recently; but the integration between britney and ci.debian.net already knows which version of a package was tested and will repeat tests until it gets a result from the desired version, so we can tolerate some slack here. debci can also choose to use lookups with versions so that workflows will fail immediately if the desired version is unavailable.

Being able to test lookups from the client side (#691) could help with this.

Pinning packages

ci.debian.net currently uses the autopkgtest --pin-packages option. For example, a test run triggered by changes to source packages foo and bar from unstable would use --pin-packages=unstable=src:foo,src:bar. However, as noted in the Autopkgtest task, we have deliberately not added support for that until now due to concerns about it not being explicit enough.

We could add support for this anyway and tolerate the unexpected behaviour. However, with a few small changes, it should be possible to get nearly-equivalent results using something similar to this:

context_artifacts:
    - collection: sid@debian:suite
      category: debian:binary-package
      error_on_empty: true
      data__srcpkg_name: "foo"
      data__srcpkg_version: "1.0-1"
      data__architecture__in: [amd64, all]
    - collection: sid@debian:suite
      category: debian:binary-package
      error_on_empty: true
      data__srcpkg_name: "bar"
      data__srcpkg_version: "2.0-2"
      data__architecture__in: [amd64, all]

This requires two extensions to lookups: an error_on_empty: true option for dictionary lookups, which turns a lookup that returns no results into an error, and support for the __in suffix in lookup matchers.

When used, this would be more precise: Debusine would know exactly which binary artifacts were requested, automatically recording them in the dynamic task data. The required extensions seem worthwhile in general.
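The intended semantics of the two proposed extensions can be illustrated with a small in-memory model. This is only a sketch: the functions `matches` and `lookup` are hypothetical and do not reflect Debusine's actual lookup implementation, and artifacts are modelled as plain dictionaries holding their data fields.

```python
from typing import Any


def matches(data: dict[str, Any], filters: dict[str, Any]) -> bool:
    """Return True if an artifact's data satisfies every data__* filter."""
    for key, expected in filters.items():
        field = key.removeprefix("data__")
        if field.endswith("__in"):
            # Proposed __in suffix: the value must be one of the listed options.
            if data.get(field.removesuffix("__in")) not in expected:
                return False
        elif data.get(field) != expected:
            return False
    return True


def lookup(
    items: list[dict[str, Any]],
    filters: dict[str, Any],
    error_on_empty: bool = False,
) -> list[dict[str, Any]]:
    """Filter items; with error_on_empty, an empty result is an error."""
    results = [item for item in items if matches(item, filters)]
    if error_on_empty and not results:
        raise LookupError(f"no artifacts match {filters!r}")
    return results
```

With `error_on_empty` set, a request for a version that has not yet reached the mirrored suite would fail immediately instead of silently testing without the pinned binaries.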

britney does not currently pass the source version of pinned packages in its request to debci (debci could only recover it by parsing the trigger field, which seems undesirable). We could change that protocol, or we could omit data__srcpkg_version from the example lookups above, which would at least still record the binary artifacts in dynamic task data.

Unfortunately, this is still not quite enough. autopkgtest(1) has this note about --pin-packages:

Attention: This does not currently resolve some situations where dependencies of the given packages can only be resolved in RELEASE. In this case the apt pinning will be removed and package installation will be retried with the entirety of RELEASE, unless --no-apt-fallback is specified.

The Debusine task cannot handle this itself, unless we were to add something complicated like preemptive dependency resolution or parsing the output of autopkgtest for failures to resolve dependencies. As a result, while we can still use context_artifacts to increase precision, there seems no alternative but to add pin_packages parameters to the Autopkgtest task and the autopkgtest workflow, with notes about their limitations. It does appear to be possible to pass --pin-packages=unstable= (without a package list) to autopkgtest along with additional binaries, which should have the same fallback behaviour.
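As a sketch of how a pin_packages parameter might be turned into the autopkgtest option, assuming a hypothetical helper named `pin_packages_arg` (not part of Debusine or debci), both the form with a source package list and the form with an empty list can be produced by the same function:

```python
def pin_packages_arg(release: str, sources: list[str]) -> str:
    """Build an autopkgtest --pin-packages option for the given release.

    An empty source list yields the bare "RELEASE=" form, relying on
    additional binaries being passed to autopkgtest separately.
    """
    pins = ",".join(f"src:{name}" for name in sources)
    return f"--pin-packages={release}={pins}"


print(pin_packages_arg("unstable", ["foo", "bar"]))
print(pin_packages_arg("unstable", []))
```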

Extra APT sources and signing keys

The extra_repositories option (defined in the same way as in the generic PackageBuild task) should already be good enough for this.

APT retries

debci uses autopkgtest --setup-commands to set Acquire::Retries "10"; in APT’s configuration. This seems like a good idea for robustness, but there’s no need for this to be done at the task level; we can simply use customization_script to adjust the environments we use. This could be done specifically for autopkgtest, but we might as well do it across the board.

Backend selection

incus-lxc should be suitable for most packages. Task configuration, maintained by the owners of the relevant workspace, can be used to configure some packages to use incus-vm instead.

Regression tracking

Debusine supports tracking and analyzing regressions, controlled by various options to QA-related workflows, particularly update_qa_results and enable_regression_tracking. britney already does similar work, and the Debian release team doesn’t want to outsource that. However, debci needs to be able to schedule reference test runs, and it makes sense to use parts of the same mechanism to do so even without the regression analysis step. This should be done using the following parameters:

qa_suite: forky@debian:suite
reference_qa_results: forky@debian:qa-results
update_qa_results: true

This will skip running the task if the reference debian:qa-results collection already contains an analysis from the last 30 days with a version matching the one published in qa_suite. However, reference runs sometimes break for other reasons such as external data or the passage of time (e.g. expired certificates), so debci needs to be able to force an update. This will be done by changing the definition of the update_qa_results parameter of the autopkgtest workflow as follows:

update_qa_results (string, defaults to no): whether to update reference QA results. Allowed values are no, yes, and force. When set to yes, the workflow runs QA tasks and updates the collection passed in reference_qa_results with the results, unless that collection already contains a current matching result. force is like yes, but does not check whether the collection already contains a current matching result.

For compatibility, boolean False is equivalent to no, and boolean True is equivalent to yes.
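The proposed parameter semantics can be summarised in a short sketch. The function names and the way the reference result is represented (a version string plus a timestamp) are assumptions made for illustration, not the workflow's actual code; the 30-day window is the one described above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Union

MAX_AGE = timedelta(days=30)


def normalize_update_qa_results(value: Union[bool, str]) -> str:
    """Map legacy booleans onto the proposed string values."""
    if value is True:
        return "yes"
    if value is False:
        return "no"
    if value in ("no", "yes", "force"):
        return value
    raise ValueError(f"invalid update_qa_results: {value!r}")


def needs_reference_run(
    update_qa_results: Union[bool, str],
    suite_version: str,
    reference_version: Optional[str],
    reference_timestamp: Optional[datetime],
    now: datetime,
) -> bool:
    """Decide whether the reference QA result should be (re)generated."""
    mode = normalize_update_qa_results(update_qa_results)
    if mode == "no":
        return False
    if mode == "force":
        return True
    # mode == "yes": skip only if a current matching result already exists.
    if reference_version is None or reference_timestamp is None:
        return True
    is_current = now - reference_timestamp <= MAX_AGE
    return not (is_current and reference_version == suite_version)
```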

Between them, britney/scripts/debci-put.py and debci need to arrange for the above parameters to be passed to Debusine when running migration-reference/0 tests. debci will then collect the results and store them in the same way as all other results it collects from Debusine.
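On the debci side, selecting these parameters might reduce to something like the sketch below. The function `reference_parameters` and its arguments are hypothetical; the parameter names and the forky collection names are taken from the example above, and the force value assumes the update_qa_results change proposed in this section.

```python
REFERENCE_TRIGGER = "migration-reference/0"


def reference_parameters(
    trigger: str, suite: str = "forky", force: bool = False
) -> dict[str, str]:
    """Return extra workflow parameters for reference runs, else nothing."""
    if trigger != REFERENCE_TRIGGER:
        return {}
    return {
        "qa_suite": f"{suite}@debian:suite",
        "reference_qa_results": f"{suite}@debian:qa-results",
        # "force" bypasses the check for a current matching result.
        "update_qa_results": "force" if force else "yes",
    }
```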