Artifacts
The categorization of artifacts does not enforce anything on the structure of the associated files and key-value data. However, some consistency and rules are needed to make meaningful use of the system.
This document presents the various categories that we use to manage a Debian-based distribution. For each category, we explain:
what associated files you can find
what key-value data you can expect
what relationships with other artifacts are likely to exist
Data key names are used in pydantic models, and must therefore be valid Python identifiers.
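The identifier constraint can be checked with the standard library alone. The sketch below is an illustration, not Debusine's own validation code: the helper name is_valid_data_key is hypothetical, and rejecting Python keywords is an extra assumption on top of the rule stated above.

```python
import keyword


def is_valid_data_key(name: str) -> bool:
    """Return True if name could be used as a Python field name."""
    # str.isidentifier() returns True for keywords such as "class", so
    # they are excluded explicitly (an assumption: the text above only
    # requires valid identifiers).
    return name.isidentifier() and not keyword.iskeyword(name)


for key in ("srcpkg_name", "dsc_fields", "built-using", "class"):
    print(key, is_valid_data_key(key))
```

This is why relationship names such as built-using can contain a dash while data keys such as srcpkg_name use underscores.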
Relationships
Artifacts may have the following relations with each other:
built-using: indicates that the build of the artifact used the target artifact (e.g. “binary-packages” artifacts are built using “source-package” artifacts)
extends: indicates that the artifact is extending the target artifact in some way (e.g. a “source-upload” artifact extends a “source-package” artifact with target distribution information)
relates-to: indicates that the artifact relates to another one in some way (e.g. a “binary-upload” artifact relates to a “binary-package” artifact, or a “package-build-log” artifact relates to a “binary-package” artifact)
Category debian:source-package
This artifact represents a set of files that can be extracted in some
way to provide a file hierarchy containing source code that can be built
into debian:binary-package
artifact(s).
Data:
name: the name of the source package
version: the version of the source package
type: the type of the source package
dpkg: for a source package that can be extracted with dpkg-source -x on the .dsc file
dsc_fields: a parsed version of the fields available in the .dsc file
Files: for the dpkg type, a .dsc file and all the files referenced in that file
Relationships:
built-using: in the case of a source package that was assembled automatically after signing files, the debian:binary-package artifacts that contain the corresponding unsigned files
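As an illustration of the data described above, the following sketch builds the key-value data of a dpkg source package from the text of an unsigned .dsc file. The function names and the sample .dsc content are hypothetical, and the parser only handles simple "Field: value" lines with indented continuation lines; it is not how Debusine itself parses .dsc files.

```python
def parse_fields(text: str) -> dict[str, str]:
    """Parse simple 'Field: value' lines; indented lines continue a field."""
    fields: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current is not None:
            # Continuation line: append it to the previous field.
            fields[current] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def source_package_data(dsc_text: str) -> dict:
    """Build the data dictionary of a debian:source-package artifact."""
    dsc_fields = parse_fields(dsc_text)
    return {
        "name": dsc_fields["Source"],
        "version": dsc_fields["Version"],
        "type": "dpkg",
        "dsc_fields": dsc_fields,
    }


# Invented sample content, shortened for the example.
SAMPLE_DSC = """\
Format: 3.0 (quilt)
Source: hello
Version: 2.10-3
Files:
 abc123 1234 hello_2.10.orig.tar.gz
"""

data = source_package_data(SAMPLE_DSC)
print(data["name"], data["version"], data["type"])
```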
Category debian:binary-package
This artifact represents a single binary package (a .deb
file or
similar) produced during the build of a source package for a given
architecture.
If the build of a source-package produces more than one binary for a given
architecture, or binaries of more than one architecture, one
debian:binary-package
artifact is created for each binary and
architecture.
Data:
srcpkg_name: the name of the source package
srcpkg_version: the version of the source package
deb_fields: a parsed version of the fields available in the .deb’s control file
deb_control_files: a list of the files in the .deb’s control part
Files: a .deb file
Relationships:
built-using: the corresponding debian:source-package
built-using: other debian:binary-package (for example in the case of signed packages duplicating the content of an unsigned package)
built-using: other debian:source-package (general case of Debian’s Built-Using field)
Category debian:binary-packages
This artifact represents the set of binary packages (.deb
files and
similar) produced during the build of a source package for a given
architecture.
If the build of a source-package produces binaries of more than one
architecture, one debian:binary-packages
artifact is created for each
architecture, listing only the binary packages for that architecture.
Data:
srcpkg_name: the name of the source package
srcpkg_version: the version of the source package
version: the version used for the build (can be different from the source version in case of binary-only rebuilds; note that individual binary packages may have versions that differ from this if the source package uses dpkg-gencontrol -v)
architecture: the architecture that the packages have been built for. Can be any real Debian architecture or all.
packages: the list of binary packages that are part of the build for this architecture.
Files: one or more .deb files
Relationships:
built-using: the corresponding debian:source-package
built-using: other debian:binary-package (for example in the case of signed packages duplicating the content of an unsigned package)
built-using: other debian:source-package (general case of Debian’s Built-Using field)
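The "one artifact per architecture" rule can be pictured as a grouping step. The sketch below assumes a hypothetical list of (package name, architecture, filename) tuples produced by a build, and assumes that the packages key holds package names. It shows how such a list would be split into one debian:binary-packages data dictionary per architecture; it is an illustration, not Debusine's implementation.

```python
from collections import defaultdict


def binary_packages_data(srcpkg_name, srcpkg_version, version, built):
    """Group built packages by architecture, one data dict per architecture.

    `built` is a hypothetical list of (package, architecture, filename).
    """
    by_arch = defaultdict(list)
    for package, architecture, _filename in built:
        by_arch[architecture].append(package)
    return {
        arch: {
            "srcpkg_name": srcpkg_name,
            "srcpkg_version": srcpkg_version,
            "version": version,
            "architecture": arch,
            "packages": sorted(packages),
        }
        for arch, packages in by_arch.items()
    }


# Invented build output for the example.
built = [
    ("hello", "amd64", "hello_2.10-3_amd64.deb"),
    ("hello-dbgsym", "amd64", "hello-dbgsym_2.10-3_amd64.deb"),
    ("hello-doc", "all", "hello-doc_2.10-3_all.deb"),
]
artifacts = binary_packages_data("hello", "2.10-3", "2.10-3", built)
for arch, data in artifacts.items():
    print(arch, data["packages"])
```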
Category debian:upload
This artifact represents an upload of source and/or binary packages.
Currently uploads are always represented with a .changes file, but the
structure of the artifact makes it possible to represent other kinds of
uploads in the future (like uploads with signed git tags, or some
debusine native internal upload).
Data:
type: the type of the source upload
dpkg: for an upload generated out of a .changes file created by dpkg-buildpackage
changes_fields: a parsed version of the fields available in the .changes file
Files:
a .changes file
all files mentioned in the .changes file
Relationships:
extends: (optional) one debian:source-package
extends: (optional) one or more debian:binary-package and/or debian:binary-packages
Category debian:package-build-log
This artifact contains a package’s build log and some associated information about the corresponding package build. It is kept around for traceability and for diagnostic purposes.
Data:
source: name of the source package built
version: version of the source package built
filename: name of the log file
maybe other information extracted out of the build log (build time, disk space used, etc.)
Files:
a single .build file
Relationships:
relates-to: one (or more) debian:binary-package and/or debian:binary-packages built
relates-to: the corresponding debian:source-package (if built from a source package)
Category debian:system-tarball
This artifact contains a tarball of a Debian system. The tarball is
compressed with xz.
Data:
filename: filename of the tarball inside the artifact (e.g. “system.tar.xz”).
vendor: name of the distribution vendor (can be found in the ID field in /etc/os-release)
codename: name of the distribution used to bootstrap the system
mirror: URL of the mirror used to bootstrap the system
variant: value of the --variant parameter of debootstrap
pkglist: a dictionary listing versions of installed packages (cf. dpkg-query -W)
architecture: the architecture of the Debian system
with_dev: boolean value indicating whether /dev has been populated with the most important special files in /dev (null, zero, full, random, urandom, tty, console, ptmx) as well as some usual symlinks (fd, stdin, stdout, stderr).
with_init: boolean value indicating whether the system contains an init system in /sbin/init and can thus be “booted” in a container.
Files:
$filename (e.g. system.tar.xz): tarball of the Debian system
Relationships:
None.
Note
In preparation for supporting different compression schemes, we have
decided that the extension of the filename dictates the compression
scheme used and that it should be compatible with tar --auto-compress.
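A consumer of the artifact can therefore pick the decompressor from the filename alone. The sketch below maps a few common suffixes to compression names; the mapping is a small illustrative subset of the suffixes that GNU tar recognizes with --auto-compress, and the function name is hypothetical.

```python
from pathlib import PurePosixPath

# Illustrative subset only; GNU tar's --auto-compress knows more suffixes.
COMPRESSION_BY_SUFFIX = {
    ".xz": "xz",
    ".gz": "gzip",
    ".bz2": "bzip2",
    ".zst": "zstd",
    ".tar": None,  # plain, uncompressed tarball
}


def compression_for(filename: str):
    """Return the compression scheme implied by the filename's extension."""
    suffix = PurePosixPath(filename).suffix
    if suffix not in COMPRESSION_BY_SUFFIX:
        raise ValueError(f"unsupported tarball extension: {suffix!r}")
    return COMPRESSION_BY_SUFFIX[suffix]


print(compression_for("system.tar.xz"))
```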
Category debian:system-image
This artifact contains a disk image of a Debian system. The disk has a GPT partition table and is EFI bootable (for architectures that support EFI). The disk image should usually contain an EFI partition and a partition for the root filesystem. The root partition should use a partition type UUID respecting the discoverable partitions specification.
Data:
Same as debian:system-tarball with some extra fields. The filename field points to the disk image.
image_format: indicates the format of the image (e.g. raw, qcow2)
filesystem: indicates the filesystem used on the root filesystem (e.g. ext4, btrfs, iso9660)
size: indicates the size of the filesystem on the root filesystem (in bytes)
boot_mechanism: a list of all the ways that the image can be booted. Valid values are efi and bios.
Files:
$filename (e.g. image.tar.xz, image.qcow2, image.iso): the nature of the file depends on the image_format field.
Relationships:
None.
Note
At this point, we expect official Debusine tasks to only generate and
use images that are bootable with EFI. But the artifact specification
has the boot_mechanism key to be future-proof and for the benefit
of custom tasks that would make different choices.
raw image format
The image itself is wrapped in an xz-compressed tarball to be able to properly support sparse filesystem images (i.e. files with holes without any data) and to save some space with compression.
The filename field points to the tarball that should contain a root.img file which is the raw disk image.
qcow2 image format
The filename field points directly to the qcow2 image.
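The two formats differ in how a consumer reaches the actual disk image. The following sketch, using only the standard library, copies root.img out of the tarball for the raw format and returns the file as-is for qcow2. The function name and the artifact_dir/workdir layout are hypothetical, error handling is minimal, and copying through extractfile() does not preserve holes in sparse images.

```python
import shutil
import tarfile
from pathlib import Path


def locate_disk_image(data: dict, artifact_dir: Path, workdir: Path) -> Path:
    """Return the path of the usable disk image for a system-image artifact."""
    image_file = artifact_dir / data["filename"]
    if data["image_format"] == "qcow2":
        # The filename field points directly to the qcow2 image.
        return image_file
    if data["image_format"] == "raw":
        # The raw image is stored as root.img inside an xz-compressed tarball.
        target = workdir / "root.img"
        with tarfile.open(image_file, mode="r:xz") as tar:
            source = tar.extractfile("root.img")
            if source is None:
                raise ValueError("root.img is not a regular file")
            with source, open(target, "wb") as destination:
                shutil.copyfileobj(source, destination)
        return target
    raise ValueError(f"unknown image_format: {data['image_format']!r}")
```

A caller would pass the artifact's data dictionary together with the directory holding the downloaded files and a scratch directory for the extracted image.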
Category debusine:work-request-debug-logs
Data: empty
Files:
any number of files containing logs and information to help a debusine user understand the WorkRequest output: commands executed, output of the commands, etc.
Relationships:
relates-to: the corresponding debian:source-package (if built from a source package)
Category debian:blhc
Data: empty
Files:
One blhc output file called blhc.txt for the corresponding input build log.
Relationships:
relates-to: the corresponding debian:package-build-log input artifact.
Category debian:lintian
Files:
lintian.txt: the raw (unfiltered) lintian output
analysis.json: the details about all the tags discovered (in a top-level tags key), some statistics/summary (in a top-level summary key) and a version key with the value 1.0 if the content follows the (initial) JSON structure described below.
Data:
summary: a duplicate of the summary key in analysis.json
analysis.json structure:
version: always 1.0
summary: a dictionary with the following keys:
tags_count_by_severity: a dictionary with a sub-key for each of the possible severities documenting the number of tags of the corresponding severity that have been emitted by lintian
package_filename: a dictionary mapping the name of the binary or source package to its associated filename (will be a single key dictionary for the case of a source package lintian analysis, and a multiple keys one for the case of an analysis of binary packages)
tags_found: the list of non-overridden tags that have been found during the analysis
overridden_tags_found: the list of overridden tags that have been found during the analysis
lintian_version: the lintian version used for the analysis
distribution: the distribution in which lintian has been run
tags: a sorted list of tags where each tag is represented with a dictionary. The list is sorted by the following criteria:
binary package name in alphabetical order (if relevant)
severity (from highest to lowest)
tag name (alphabetical order)
tag details (alphabetical order)
Each tag is represented with the following fields:
tag: the name of the tag
severity: one of the possible severities (see below for full list)
package: the name of the binary or source package (there is no risk of confusion between a source and a binary of the same name, as the artifact with the analysis is dedicated either to a source package or to a set of binary packages, but not to both at the same time)
note: the details associated with the tag (those are printed after the tag name in the lintian output)
pointer: the optional part shown between angle brackets that gives a specific location for the issue (often a filename and a line number)
explanation: the long description shown after a tag with --info, i.e. the lines prefixed with N: (they always start and end with an empty line)
comment: the maintainer’s comment shown on lines prefixed with N: just before a given overridden tag (those lines can be identified by the lack of an empty line between them and the tag)
Note
Here’s the ordered list of all the possible severities (from highest to lowest):
error
warning
info
pedantic
experimental
overridden
classification
Note that experimental and overridden are not true tag
severities, but lintian’s output replaces the usual severity field
for those tags with X or O and it is thus not easily possible
to capture the original severity.
And while classification is implemented like a low-severity issue,
those tags do not represent real issues, they are just a convenient way
to export data generated while doing the analysis.
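The sort order of the tags list and the tags_count_by_severity summary can be reproduced with a plain sort key. This is a sketch under assumptions: the sample tags are invented, only the fields needed for sorting are shown, "tag details" is taken to mean the note field, and zero counts are included for severities that were not emitted. The real analysis.json is produced by Debusine, not by this code.

```python
from collections import Counter

# Severities from highest to lowest, as listed in the note above.
SEVERITIES = [
    "error", "warning", "info", "pedantic",
    "experimental", "overridden", "classification",
]
SEVERITY_RANK = {name: rank for rank, name in enumerate(SEVERITIES)}


def sort_key(tag: dict):
    """Order by package, then severity (highest first), tag name, details."""
    return (
        tag["package"],
        SEVERITY_RANK[tag["severity"]],
        tag["tag"],
        tag["note"],
    )


# Invented sample tags.
tags = [
    {"package": "hello", "severity": "info", "tag": "b-tag", "note": ""},
    {"package": "hello", "severity": "error", "tag": "z-tag", "note": ""},
    {"package": "hello", "severity": "error", "tag": "a-tag", "note": "x"},
    {"package": "apple", "severity": "pedantic", "tag": "c-tag", "note": ""},
]
tags.sort(key=sort_key)

# One entry per possible severity, including those with a zero count.
counts = Counter(tag["severity"] for tag in tags)
tags_count_by_severity = {name: counts[name] for name in SEVERITIES}
print([tag["tag"] for tag in tags])
print(tags_count_by_severity)
```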
Relationships:
relates-to: the corresponding artifacts that have been analyzed. They can be of type debian:source-package, debian:binary-package, debian:binary-packages, or debian:upload.
Category debian:autopkgtest
Data:
results: a dictionary with details about the tests that have been run. Each key is the name of the test (as shown in the summary file) and the value is another dictionary with the following keys:
status: one of PASS, FAIL, FLAKY or SKIP
details: more details when available
cmdline: the complete command line that has been used for the run
source_package: a dictionary with some information about the source package hosting the tests that have been run. It has the following sub-keys:
name: the name of the source package
version: the version of the source package
url: the URL of the source package
architecture: the architecture of the system where tests have been run
distribution: the distribution of the system where tests have been run (formatted as VENDOR:CODENAME)
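To make the shape of this data concrete, the sketch below counts tests per status and splits the distribution field. The test names, details and values are invented for illustration, and the helper name summarize is hypothetical.

```python
from collections import Counter

# Invented sample data following the keys described above.
data = {
    "results": {
        "smoke": {"status": "PASS"},
        "upstream-tests": {"status": "FAIL", "details": "non-zero exit status 1"},
        "network": {"status": "SKIP", "details": "needs-internet"},
    },
    "architecture": "amd64",
    "distribution": "debian:bookworm",
}


def summarize(results: dict) -> Counter:
    """Count tests per status (PASS, FAIL, FLAKY or SKIP)."""
    return Counter(test["status"] for test in results.values())


summary = summarize(data["results"])
failed = sorted(
    name for name, test in data["results"].items() if test["status"] == "FAIL"
)
# The distribution is formatted as VENDOR:CODENAME.
vendor, codename = data["distribution"].split(":", 1)
print(dict(summary), failed, vendor, codename)
```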
Files:
Every file found in the autopkgtest output directory, except for files in binaries/ that are excluded to save space.
Relationships:
relates-to: the artifacts used as input that are part of the source package being tested. They can be of types debian:source-package, debian:upload or debian:binary-packages
Category debusine:signing-key
This artifact records the existence of a key in the signing service. To avoid expiry, these artifacts should also be added to the collection that they are intended to be used with; they are therefore valid items in debian:suite-signing-keys collections.
Data:
purpose: the purpose of this key: uefi or openpgp
fingerprint: the fingerprint of this key
public_key: the base64-encoded public key
Files: none
Relationships: none
Todo
We’ll eventually need to deal with removing private keys from the signing service when they’re no longer referenced by anything in the debusine database, but it’s not clear exactly what the lifetime rules should be. See https://salsa.debian.org/freexian-team/debusine/-/merge_requests/616#note_497019.
Category debusine:signing-input
This artifact provides input to a Sign task. It will typically be created by the ExtractForSigning task or the Sbuild task.
Data:
trusted_certs: a list of SHA-256 fingerprints of certificates built into the signed code as roots of trust for verifying additional privileged code (see Describing the trust chain). If present, all the listed fingerprints must be listed in the DEBUSINE_SIGNING_TRUSTED_CERTS Django setting. This is used to avoid accidentally creating trust chains from production to test signing certificates.
binary_package_name: the name of the binary package that this artifact was extracted from, if any
Files: one or more files to be signed
Relationships:
relates-to: any other artifacts from which the files to be signed were extracted
Category debusine:signing-output
This artifact contains the output of a Sign task.
Data:
purpose: the purpose of the key used to sign these files: uefi or openpgp.
fingerprint: the fingerprint of the key used to sign these files
results: a list of dictionaries describing signed files, as follows (exactly one of output_file and error_message must be present):
file: name of the file that was signed
output_file: name of the file containing the signature
error_message: error message resulting from attempting to sign the file
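The "exactly one of output_file and error_message" rule for each entry of results can be checked as follows. This is a sketch, not Debusine's validation code: the function name check_result and the sample file names are hypothetical, and treating a key set to None as absent is an assumption.

```python
def check_result(result: dict) -> None:
    """Raise ValueError unless exactly one of the two outcome keys is set."""
    if "file" not in result:
        raise ValueError("missing 'file'")
    # Assumption: a key whose value is None counts as absent.
    has_output = result.get("output_file") is not None
    has_error = result.get("error_message") is not None
    if has_output == has_error:
        raise ValueError(
            f"{result['file']}: exactly one of output_file and "
            "error_message must be present"
        )


# Invented sample entries: one success and one failure.
results = [
    {"file": "vmlinuz", "output_file": "vmlinuz.sig"},
    {"file": "grubx64.efi", "error_message": "signing failed"},
]
for result in results:
    check_result(result)
```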
Files:
zero or more files containing signatures
Relationships:
relates-to: the debusine:signing-key used for signing, and the corresponding debusine:signing-input artifact
Category debusine:promise
This bare data item represents an artifact that will eventually be provided as the output of some existing work request. The bare data item may contain additional properties of said artifact to help differentiate them. These items are created by workflows when they populate their work request graphs.
Workflows add promises to their internal collection using the same name as they use for the expected artifact.
Data:
promise_work_request_id (integer, required): the ID of the work request that is expected to fulfil this promise
promise_workflow_id (integer, required): the ID of the workflow work request that created this promise
promise_category (string, required): the category of the artifact that is expected to be provided
Other data items should match the per-item data structure used when adding
the promised artifact to the workflow’s internal collection, which is
determined by the parent workflow based on its protocol for communicating
with other workflows. This allows constructing lookups that match either a
promise or the promised artifact. Names starting with promise_ are
reserved.
When a work request is retried, it must update the work request ID in any associated promises.
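The following sketch shows how the data of a promise could be assembled while keeping the promise_ prefix reserved for the three required keys. The function name make_promise_data, the IDs and the extra architecture item are hypothetical; the actual per-item data depends on the parent workflow.

```python
RESERVED_PREFIX = "promise_"


def make_promise_data(work_request_id: int, workflow_id: int,
                      category: str, **item_data) -> dict:
    """Build the data of a debusine:promise bare data item."""
    # Extra per-item data must not use names from the reserved namespace.
    reserved = [key for key in item_data if key.startswith(RESERVED_PREFIX)]
    if reserved:
        raise ValueError(f"reserved key names: {reserved}")
    return {
        "promise_work_request_id": work_request_id,
        "promise_workflow_id": workflow_id,
        "promise_category": category,
        **item_data,
    }


promise = make_promise_data(
    42, 7, "debian:binary-packages", architecture="amd64"
)
print(promise)
```

On retry, only promise_work_request_id would need to change; the remaining keys stay as they are so that lookups keep matching.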
Category debian:debdiff
Data:
original: the name of the first file passed to debdiff.
new: the name of the second file passed to debdiff.
Files:
One debdiff output file called debdiff.txt for the corresponding two source or binary packages.
Relationships:
relates-to: two debian:source-package or debian:binary-package