Tasks
External API: input and output of tasks
Autopkgtest task (NOT IMPLEMENTED YET)
The task_data associated to this task can contain the following keys:

- input (required): a dictionary describing the input data:
  - source_artifact_id (required): the ID of the artifact representing the source package to be tested with autopkgtest
  - binary_artifacts_ids (required): a list of debian:binary-packages artifact IDs representing the binary packages to be tested with autopkgtest (they are expected to be part of the same source package as the one identified with source_artifact_id)
  - context_artifacts_ids (optional): a list of debian:binary-packages artifact IDs representing a special context for the tests. This is used to trigger autopkgtests of reverse dependencies, where context_artifacts_ids is set to the artifacts of the updated package whose reverse dependencies are tested, and the source/binary artifact IDs identify one of the reverse dependencies whose autopkgtests will be executed.
- architecture (required): the Debian architecture that will be used in the chroot or VM where tests are going to be run. The packages submitted in input:binary_artifacts_ids usually have a matching architecture (but need not in the case of cross-architecture package testing, e.g. testing i386 packages on an amd64 system).
- distribution (required): base distribution of the chroot/container/VM that will be used to run the tests. The distribution codename is prefixed by the vendor identifier.
- backend (optional): the virtualization backend to use, defaults to auto where the task is free to use the most suitable backend. Other valid options are schroot, lxc, qemu and podman.
Note
We are not supporting all backends as there’s a cost in supporting
the required infrastructure setup for each of them. For schroot, we
already have that infrastructure due to the sbuild task. For podman
and qemu we can do all the setup without root rights. As for lxc,
it’s the reference backend used by Debian and we want to support it too.
- include_tests (optional): a list of the tests that will be executed. If not provided (or empty), defaults to all tests being executed. Translates into --test-name=TEST command line options.
- exclude_tests (optional): a list of tests that will be skipped. If not provided (or empty), then no tests are skipped. Translates into --skip-test=TEST command line options.
- debug_level (optional, defaults to 0): a debug level between 0 and 3. Translates into -d up to -ddd command line options.
- extra_apt_sources (optional): a list of APT sources. Each APT source is described by a single line (deb http://MIRROR CODENAME COMPONENT) that is copied to a file in /etc/apt/sources.list.d. Translates into --add-apt-source command line options.
- use_packages_from_base_repository (optional, defaults to False): if True, then we pass --apt-default-release=$DISTRIBUTION with the name of the base distribution given in the distribution key.
- environment (optional): a dictionary listing environment variables to inject in the build and test environment. Translates into (multiple) --env=VAR=VALUE command line options.
- needs_internet (optional, defaults to “run”): translates directly into the --needs-internet command line option. Allowed values are “run”, “try” and “skip”.
- fail_on (optional): indicates whether the work request must be marked as failed in different scenarios identified by the following sub-keys:
  - failed_test (optional, defaults to true): at least one test has failed (and the test was not marked as flaky)
  - flaky_test (optional, defaults to false): at least one flaky test has failed
  - skipped_test (optional, defaults to false): at least one test has been skipped
- timeout (optional): a dictionary where each key/value pair maps to the corresponding --timeout-KEY=VALUE command line option, with the exception of the global key that maps to --timeout=VALUE. Supported keys are global, factor, short, install, test, copy and build.
Note
At this point, we have voluntarily not added any key for the
--pin-packages option because that option is not explicit enough:
differences between the mirror used to schedule jobs and the mirror
used by the jobs result in tests that are not testing the version that
we want. We believe it’s better to submit all modified packages
explicitly via input:context_artifacts_ids so that we are sure of the
.deb files that we are submitting and testing with. That way we can
even test reverse dependencies before the modified package is
available in any repository.

This assumes that we can submit arbitrary .deb files on the command line and that they are effectively used as part of the package setup.
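To make the key descriptions above concrete, here is a hypothetical task_data dictionary for this task, together with a small sketch of the timeout mapping just described. The artifact IDs, distribution, architecture and backend values are invented for the example; timeout_options is an illustrative helper, not part of debusine's API.

```python
# Hypothetical task_data for the autopkgtest task; all values are
# purely illustrative.
task_data = {
    "input": {
        "source_artifact_id": 42,
        "binary_artifacts_ids": [43, 44],
    },
    "architecture": "amd64",
    "distribution": "debian:bookworm",
    "backend": "qemu",
    "exclude_tests": ["flaky-network-test"],  # translates into --skip-test
    "debug_level": 1,                         # translates into -d
    "fail_on": {"failed_test": True, "skipped_test": False},
    "timeout": {"global": 3600, "short": 100},
}

def timeout_options(timeout: dict) -> list[str]:
    """Map the timeout dictionary to autopkgtest command line options."""
    options = []
    for key, value in timeout.items():
        if key == "global":
            # "global" is special-cased to the plain --timeout option
            options.append(f"--timeout={value}")
        else:
            options.append(f"--timeout-{key}={value}")
    return options
```

With the timeout dictionary above, this yields --timeout=3600 and --timeout-short=100.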
autopkgtest is always run with the options --apt-upgrade --output-dir=ARTIFACT-DIR --summary=ARTIFACT-DIR/summary --no-built-binaries.
An artifact of category debian:autopkgtest is generated and its
content is a copy of what’s available in the ARTIFACT-DIR (except
for files in binaries/, which are excluded to save space). The artifact has
“relates to” relationships with the artifacts used as input that are part of the
source package being tested.
The data field of the artifact has the following structure:

- results: a dictionary with details about the tests that have been run. Each key is the name of the test (as shown in the summary file) and the value is another dictionary with the following keys:
  - status: one of PASS, FAIL, FLAKY or SKIPPED
  - details: more details when available
- cmdline: the complete command line that has been used for the run
- source_package: a dictionary with some information about the source package hosting the tests that have been run. It has the following sub-keys:
  - name: the name of the source package
  - version: the version of the source package
  - url: the URL of the source package
- architecture: the architecture of the system where tests have been run
- distribution: the distribution of the system where tests have been run (formatted as VENDOR:CODENAME)
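An illustrative instance of this structure follows; every name and value (test names, package, URL) is invented for the example.

```python
# Illustrative data field of a debian:autopkgtest artifact; all values
# are invented for the example.
artifact_data = {
    "results": {
        # one entry per test, keyed by the name shown in the summary file
        "smoke-test": {"status": "PASS", "details": ""},
        "unit-tests": {"status": "FAIL", "details": "exit status 1"},
    },
    "cmdline": "autopkgtest --apt-upgrade --no-built-binaries ... -- qemu ...",
    "source_package": {
        "name": "hello",
        "version": "2.10-3",
        "url": "https://deb.debian.org/pool/main/h/hello/hello_2.10-3.dsc",
    },
    "architecture": "amd64",
    "distribution": "debian:bookworm",
}
```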
Debootstrap task (NOT IMPLEMENTED YET)
The debootstrap task implements the SystemBootstrap interface except that it only supports a single repository in the bootstrap_repositories key.

On top of the keys defined in that interface, it also supports the following additional keys in task_data:
- bootstrap_options
  - script: last parameter on debootstrap’s command line
The various keys in the first entry of bootstrap_repositories are mapped to the corresponding command line options and parameters:

- mirror, suite and script map to positional command line parameters
- components maps to --components
- check_signature maps to --check-gpg or --no-check-gpg
- keyring_package maps to an extra package name in --include
- keyring maps to --keyring

The following keys from bootstrap_options are also mapped:

- variant maps to --variant
- extra_packages maps to --include
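The mapping above can be sketched as a small function building a debootstrap command line. This is an illustration of the mapping only, not debusine's implementation; the function name and the /target directory are assumptions, and script is taken from bootstrap_options as listed above.

```python
# Sketch of the bootstrap_repositories/bootstrap_options -> debootstrap
# command line mapping described above; illustrative only.
def debootstrap_cmdline(repository: dict, options: dict) -> list[str]:
    cmd = ["debootstrap"]
    if "variant" in options:
        cmd.append(f"--variant={options['variant']}")
    # extra_packages and keyring_package both end up in --include
    include = list(options.get("extra_packages", []))
    if "keyring_package" in repository:
        include.append(repository["keyring_package"])
    if include:
        cmd.append("--include=" + ",".join(include))
    if "check_signature" in repository:
        cmd.append("--check-gpg" if repository["check_signature"]
                   else "--no-check-gpg")
    if "keyring" in repository:
        cmd.append(f"--keyring={repository['keyring']}")
    if "components" in repository:
        cmd.append("--components=" + ",".join(repository["components"]))
    # suite, target directory and mirror are positional parameters;
    # script, when given, is the last positional parameter
    cmd += [repository["suite"], "/target", repository["mirror"]]
    if "script" in options:
        cmd.append(options["script"])
    return cmd
```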
Mmdebstrap task (NOT IMPLEMENTED YET)
The mmdebstrap task fully implements the SystemBootstrap interface.

The keys from bootstrap_options are mapped to command line options:

- variant maps to --variant (and it supports more values than debootstrap, see its manual page)
- extra_packages maps to --include

The keys from bootstrap_repositories are used to build a sources.list file that is then fed to mmdebstrap as input.
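The sources.list generation can be sketched as follows; the function name is illustrative, and the assumption that each repository entry carries mirror, suite and components keys follows the SystemBootstrap repository description used above.

```python
# Sketch of building a sources.list from bootstrap_repositories for
# mmdebstrap; illustrative, not debusine's implementation.
def build_sources_list(repositories: list[dict]) -> str:
    lines = []
    for repo in repositories:
        # one "deb MIRROR SUITE COMPONENTS..." line per repository
        components = " ".join(repo.get("components", ["main"]))
        lines.append(f"deb {repo['mirror']} {repo['suite']} {components}")
    return "\n".join(lines) + "\n"
```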
Lintian task
The task_data associated to this task can contain the following keys:

- input (required): a dictionary of values describing the input data; one of the sub-keys is required, but both can be given at the same time too.
  - source_artifact_id (optional): the ID of the artifact representing the source package to be tested with lintian
  - binary_artifacts_ids (optional): a list of debian:binary-packages artifact IDs representing the binary packages to be tested with lintian (they are expected to be part of the same source package as the one identified with source_artifact_id)
Note
While it’s possible to submit only a source or only a single binary artifact, you should aim to always submit source + arch-all + arch-any related artifacts to get the best test coverage, as some tags can only be emitted when lintian has access to all of them at the same time.
- output (optional): a dictionary of values controlling some aspects of the generated artifacts:
  - source_analysis (optional, defaults to True): indicates whether we want to generate the debian:lintian artifact for the source package
  - binary_all_analysis (optional, defaults to True): same as source_analysis but for the debian:lintian artifact related to Architecture: all packages
  - binary_any_analysis (optional, defaults to True): same as source_analysis but for the debian:lintian artifact related to Architecture: any packages
- target_distribution (optional): the fully qualified name of the distribution that will provide the lintian software to analyze the packages. Defaults to debian:unstable.
- min_lintian_version (optional): request that the analysis be performed with a lintian version that is higher than or equal to the version submitted. If a satisfying version is not pre-installed and cannot be installed with apt-get install lintian, then the work request is aborted.
- include_tags (optional): a list of the lintian tags that are allowed to be reported. If not provided (or empty), defaults to all. Translates into the --tags or --tags-from-file command line option.
- exclude_tags (optional): a list of the lintian tags that are not allowed to be reported. If not provided (or empty), then no tags are hidden. Translates into the --suppress-tags or --suppress-tags-from-file command line option.
- fail_on_severity (optional, defaults to none): if the analysis emits tags of that severity or higher, then the task will return a “failure” instead of a “success”. Valid values are (in decreasing severity) “error”, “warning”, “info”, “pedantic”, “experimental”, “overridden”. “none” is a special value indicating that we should never fail.
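A hypothetical task_data for this task, plus a sketch of the fail_on_severity check, might look as follows. The IDs, versions and tag names are invented, and must_fail is an illustrative helper, not debusine's API.

```python
# Hypothetical task_data for the lintian task; values are illustrative.
task_data = {
    "input": {
        "source_artifact_id": 10,
        "binary_artifacts_ids": [11, 12],
    },
    "output": {"binary_all_analysis": False},
    "target_distribution": "debian:unstable",
    "exclude_tags": ["debian-changelog-line-too-long"],
    "fail_on_severity": "warning",
}

# Severities in decreasing order, as listed above for fail_on_severity.
SEVERITIES = ["error", "warning", "info", "pedantic",
              "experimental", "overridden"]

def must_fail(found_severities: list[str], threshold: str) -> bool:
    """Return True if any found severity is at or above the threshold."""
    if threshold == "none":
        return False  # special value: never fail
    cutoff = SEVERITIES.index(threshold)
    return any(SEVERITIES.index(s) <= cutoff for s in found_severities)
```

For instance, with fail_on_severity set to "warning", a run emitting only "info" tags succeeds while one emitting an "error" tag fails.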
The lintian runs will always use the options --display-level ">=classification" (>=pedantic in jessie) --no-cfg --display-experimental --info --show-overrides to collect the full set of data that lintian can provide.
Note
Current lintian can generate “masked” tags (with an M: prefix) when you
use --show-overrides. For the purposes of debusine, we entirely
ignore those tags on the basis that it’s lintian’s decision to hide
them (and not the maintainer’s decision) and, as such, they don’t bring
any useful information. Lintian is full of exceptions to not emit some
tags, and the fact that some tags rely on a modular exception mechanism
that can be diverted to generate masked tags is not useful to package
maintainers.

For those reasons, we suggested to lintian’s maintainers to entirely stop emitting those tags in https://bugs.debian.org/1053892
Between 1 and 3 artifacts of category debian:lintian will be generated (one
for each source/binary package artifact submitted), and they will have a
“relates to” relationship with the corresponding artifact that has been
analyzed. These artifacts contain a lintian.txt file with the raw
(unfiltered) lintian output and an analysis.json file with the details
about all the tags discovered (in a top-level tags key), some
statistics/summary (in a top-level summary key) and a version key
with the value 1.0 if the content follows the (initial) JSON structure
described below.

The summary key is also duplicated in the data field of the
artifact. It is a dictionary with the following keys:
- tags_count_by_severity: a dictionary with a sub-key for each of the possible severities, documenting the number of tags of the corresponding severity that have been emitted by lintian
- package_filename: a dictionary mapping the name of the binary or source package to its associated filename (a single-key dictionary in the case of a source package lintian analysis, and a multiple-key one in the case of an analysis of binary packages)
- tags_found: the list of non-overridden tags that have been found during the analysis
- overridden_tags_found: the list of overridden tags that have been found during the analysis
- lintian_version: the lintian version used for the analysis
- distribution: the distribution in which lintian has been run
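An illustrative summary dictionary follows; the package name, filename, tag names, counts and lintian version are all invented for the example.

```python
# Illustrative summary data for a debian:lintian artifact; all values
# are invented for the example.
summary = {
    "tags_count_by_severity": {
        "error": 0, "warning": 2, "info": 1, "pedantic": 0,
        "experimental": 0, "overridden": 1, "classification": 0,
    },
    # single key here because this analysis covers a source package
    "package_filename": {"hello": "hello_2.10-3.dsc"},
    "tags_found": ["debian-changelog-line-too-long"],
    "overridden_tags_found": ["package-has-long-file-name"],
    "lintian_version": "2.116.3",
    "distribution": "debian:unstable",
}
```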
The tags key in analysis.json is a sorted list of tags where each tag
is represented with a dictionary. The list is sorted by the following
criteria:

- binary package name in alphabetical order (if relevant)
- severity (from highest to lowest)
- tag name (alphabetical order)
- tag details (alphabetical order)
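The four sorting criteria can be expressed as a sort key over the tag dictionaries; this is an illustrative sketch (the field names mirror the tag structure documented for analysis.json, the function itself is not debusine's code).

```python
# A possible sort key implementing the four criteria above; the details
# of a tag live in its "note" field.
SEVERITIES = ["error", "warning", "info", "pedantic",
              "experimental", "overridden", "classification"]

def tag_sort_key(tag: dict) -> tuple:
    return (
        tag.get("package", ""),             # binary package name (if relevant)
        SEVERITIES.index(tag["severity"]),  # highest severity first
        tag["tag"],                         # tag name, alphabetical
        tag.get("note", ""),                # tag details, alphabetical
    )

tags = [
    {"package": "pkg", "severity": "info", "tag": "b-tag", "note": ""},
    {"package": "pkg", "severity": "error", "tag": "a-tag", "note": ""},
]
tags.sort(key=tag_sort_key)
```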
Each tag is represented with the following fields:

- tag: the name of the tag
- severity: one of the possible severities (see below for the full list)
- package: the name of the binary or source package (there is no risk of confusion between a source and a binary of the same name, as the artifact with the analysis is dedicated either to a source package or to a set of binary packages, but not to both at the same time)
- note: the details associated to the tag (those are printed after the tag name in the lintian output)
- pointer: the optional part shown between angle brackets that gives a specific location for the issue (often a filename and a line number)
- explanation: the long description shown after a tag with --info, i.e. the lines prefixed with N: (they always start and end with an empty line)
- comment: the maintainer’s comment shown on lines prefixed with N: just before a given overridden tag (those lines can be identified by the lack of an empty line between them and the tag)
Note
Here’s the ordered list of all the possible severities (from highest to lowest):

- error
- warning
- info
- pedantic
- experimental
- overridden
- classification

Note that experimental and overridden are not true tag
severities, but lintian’s output replaces the usual severity field
for those tags with X or O, and it is thus not easily possible
to capture the original severity.

And while classification is implemented like a low-severity issue,
those tags do not represent real issues; they are just a convenient way
to export data generated while doing the analysis.
Sbuild task
Regarding inputs, the sbuild
task is compatible with the ontology
defined for Task PackageBuild even though it implements only
a subset of the possible options at this time.
Todo
Document the outputs of the task (artifact and their relationships)
Piuparts task
A specific task to represent a binary package check using the
piuparts
utility.
The task_data associated to this task can contain the following keys:

- input (required): a dictionary describing the input data
  - binary_artifacts_ids (required): a list of debian:binary-packages artifact IDs representing the binary packages to be tested. Multiple artifacts can be provided so as to support e.g. testing binary packages from split indep/arch builds.
- distribution (required): name of the target distribution
- host_architecture (required): the architecture that we want to test on

The piuparts output will be provided as a new artifact.
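For illustration, a hypothetical task_data for this task could look like the following; the artifact IDs and the distribution/architecture values are invented.

```python
# Hypothetical task_data for the piuparts task; values are illustrative.
task_data = {
    "input": {
        # e.g. the arch-all and arch-any builds of the same source package
        "binary_artifacts_ids": [21, 22],
    },
    "distribution": "debian:bookworm",
    "host_architecture": "amd64",
}
```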
Internal API: debusine.tasks
Collection of tasks.
The debusine.tasks module hierarchy hosts a collection of Task subclasses that are used by workers to fulfill WorkRequests sent by the debusine scheduler.
Creating a new task requires adding a new file containing a class inheriting
from the Task
base class. The name of the class must be unique among
all child classes.
A child class must, at the very least, override the Task.execute()
method.
- class debusine.tasks.Task[source]
Base class for tasks.
A Task object serves two purposes: encapsulating the logic of what needs to be done to execute the task (cf. configure() and execute() that are run on a worker), and supporting the scheduler by determining if a task is suitable for a given worker. That is done in a two-step process: collating metadata from each worker (with the analyze_worker() method that is run on a worker) and then, based on this metadata, checking if a task is suitable (with can_run_on() that is executed on the scheduler).

- TASK_DATA_SCHEMA: dict[str, Any] = {}
Can be overridden to enable jsonschema validation of the task_data parameter passed to configure().
- TASK_VERSION: Optional[int] = None
Must be overridden by child classes to document the current version of the task’s code. A task will only be scheduled on a worker if its task version is the same as the one running on the scheduler.
- property aborted: bool
Return if the task is aborted.
Tasks cannot transition from aborted -> not-aborted.
- analyze_worker() dict [source]
Return dynamic metadata about the current worker.
This method is called on the worker to collect information about the worker. The information is stored as a set of key-value pairs in a dictionary.
That information is then reused on the scheduler to be fed to can_run_on() and determine if a task is suitable to be executed on the worker.

Derived objects can extend the behaviour by overriding the method, calling metadata = super().analyze_worker(), and then adding supplementary data in the dictionary.

To avoid conflicts on the names of the keys used by different tasks, you should use key names obtained with self.prefix_with_task_name(...).

- Returns:
a dictionary describing the worker.
- Return type:
dict.
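The extension pattern described above can be sketched as follows. The Task class here is a minimal, self-contained stand-in for debusine.tasks.Task (its real metadata and the exact behaviour of prefix_with_task_name() may differ); only the override pattern is the point.

```python
# Sketch of extending analyze_worker(); Task is a stand-in base class.
class Task:
    def __init__(self):
        # the real base class computes the name from the class name
        self.name = self.__class__.__name__.lower()

    def analyze_worker(self) -> dict:
        return {}  # the real method collects worker information here

    def prefix_with_task_name(self, key: str) -> str:
        # prefix keys with the task name to avoid conflicts between tasks
        return f"{self.name}:{key}"

class MyTask(Task):
    def analyze_worker(self) -> dict:
        metadata = super().analyze_worker()
        # add supplementary, task-specific data under a prefixed key
        metadata[self.prefix_with_task_name("available")] = True
        return metadata
```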
- classmethod analyze_worker_all_tasks()[source]
Return dictionary with metadata for each task in Task._sub_tasks.
Subclasses of Task get registered in Task._sub_tasks. Return a dictionary with the metadata of each of the subtasks.
This method is executed in the worker when submitting the dynamic metadata.
- append_to_log_file(filename: str, lines: list[str]) None [source]
Open log file and write contents into it.
- Parameters:
filename – use self.open_debug_log_file(filename)
lines – write contents to the logfile
- can_run_on(worker_metadata: dict) bool [source]
Check if the specified worker can run the task.
This method shall take its decision solely based on the supplied worker_metadata and on the configured task data (self.data).

The default implementation always returns True, except if there’s a mismatch between the TASK_VERSION on the scheduler side and on the worker side.

Derived objects can implement further checks by overriding the method in the following way:

if not super().can_run_on(worker_metadata):
    return False
if ...:
    return False
return True
- Parameters:
worker_metadata (dict) – The metadata collected from the worker by running analyze_worker() on all the tasks on the worker under consideration.
- Returns:
the boolean result of the check.
- Return type:
bool.
- static class_from_name(sub_task_class_name: str) Type[Task] [source]
Return the class for sub_task_class_name (case-insensitive).
__init_subclass__() registers Task subclasses into Task._sub_tasks.
- configure(task_data)[source]
Configure the task with the supplied task_data.

The supplied data is first validated against the JSON schema defined in the TASK_DATA_SCHEMA class attribute. If validation fails, a TaskConfigError is raised. Otherwise, the supplied task_data is stored in the data attribute.

Derived objects can extend the behaviour by overriding the method and calling super().configure(task_data); however, the extra checks must not access any resource of the worker, as the method can also be executed on the server when it tries to schedule work requests.

- Parameters:
task_data (dict) – The supplied data describing the task.
- Raises:
TaskConfigError – if the JSON schema is not respected.
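The configure() flow can be sketched conceptually as below. The real method validates against TASK_DATA_SCHEMA with jsonschema; the hand-rolled required-keys check here only stands in for that to keep the example dependency-free, and ExampleTask is not part of debusine.

```python
# Conceptual sketch of configure(): validate task_data, raise
# TaskConfigError on failure, store the data in self.data.
class TaskConfigError(Exception):
    """Raised when task_data does not respect the schema."""

class ExampleTask:
    # stand-in for the "required" part of a TASK_DATA_SCHEMA
    REQUIRED_KEYS = {"input", "distribution", "host_architecture"}

    def configure(self, task_data: dict) -> None:
        missing = self.REQUIRED_KEYS - task_data.keys()
        if missing:
            raise TaskConfigError(f"missing keys: {sorted(missing)}")
        self.data = task_data
```

A derived class would call super().configure(task_data) first and only then apply its extra checks, without touching worker resources.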
- execute() bool [source]
Call the _execute() method, upload debug artifacts.
See _execute() for more information.
- Returns:
result of the _execute() method.
- find_file_by_suffix(directory: Path, suffix: str) Optional[Path] [source]
Find file in directory with the specified suffix.
If there is no file ending with the suffix, or there is more than one such file, return None and write a log in the directory.
- Parameters:
directory – directory to find the file. Not recursive.
suffix – suffix to find.
- Returns:
file path or None
- static is_valid_task_name(task_name) bool [source]
Return True if task_name is registered (its class is imported).
- logger
A logging.Logger instance that can be used in child classes when you override methods to implement the task.
- name
The name of the task. It is computed by __init__() by converting the class name to lowercase.
- open_debug_log_file(filename: str, *, mode='a') Union[TextIO, BinaryIO] [source]
Open a temporary file and return it.
The files are always in the same temporary directory; calling it twice with the same file name will open the same file.
The caller must call .close() when finished writing.
Task to build Debian packages with sbuild.
This task module implements the PackageBuild ontology for its task_data: https://freexian-team.pages.debian.net/debusine/design/ontology.html#task-packagebuild
- class debusine.tasks.sbuild.Sbuild[source]
Task implementing a Debian package build with sbuild.
- TASK_DATA_SCHEMA: dict[str, Any] = {
      'type': 'object',
      'required': ['input', 'distribution', 'host_architecture'],
      'additionalProperties': False,
      'properties': {
          'input': {
              'type': 'object',
              'required': ['source_artifact_id'],
              'additionalProperties': False,
              'properties': {'source_artifact_id': {'type': 'integer'}},
          },
          'distribution': {'type': 'string'},
          'host_architecture': {'type': 'string'},
          'build_components': {
              'type': 'array',
              'uniqueItems': True,
              'items': {'enum': ['any', 'all', 'source']},
          },
          'sbuild_options': {'type': 'array', 'items': {'type': 'string'}},
      },
  }
Can be overridden to enable jsonschema validation of the task_data parameter passed to configure().
- TASK_VERSION: Optional[int] = 1
Must be overridden by child classes to document the current version of the task’s code. A task will only be scheduled on a worker if its task version is the same as the one running on the scheduler.
- can_run_on(worker_metadata: dict) bool [source]
Check that the specified worker can run the requested task.
- configure_for_execution(download_directory: Path) bool [source]
Configure Task: set variables needed for the build() step.
Return True if configuration worked, False if there was a problem.

Note: self.find_file_by_suffix() writes to a log file to be uploaded as an artifact.