Collection of tasks.
The debusine.tasks module hierarchy hosts a collection of Task subclasses that are
used by workers to fulfill WorkRequests sent by the debusine scheduler.
Creating a new task requires adding a new file containing a class inheriting
from the Task base class. The name of the class must be unique among
all child classes.
A child class must, at the very least, override the
Base class for tasks.
A Task object serves two purposes: encapsulating the logic of what needs to be done to execute the task (cf. execute(), which is run on a worker), and supporting the scheduler by determining whether a task is suitable for a given worker. The latter is a two-step process: metadata is first collected from each worker (with the analyze_worker() method, which is run on the worker), and then, based on this metadata, the scheduler checks whether the task is suitable (with can_run_on(), which is executed on the scheduler).
Can be overridden to enable jsonschema validation of the
task_data parameter passed to
Must be overridden by child classes to document the current version of the task’s code. A task will only be scheduled on a worker if its task version is the same as the one running on the scheduler.
Return whether the task is aborted.
A task cannot transition from aborted back to not-aborted.
analyze_worker() → dict
Return dynamic metadata about the current worker.
This method is called on the worker to collect information about the worker. The information is stored as a set of key-value pairs in a dictionary.
That information is then fed to can_run_on() on the scheduler
to determine whether a task is suitable to be executed on the worker.
Derived objects can extend the behaviour by overriding the method, calling
metadata = super().analyze_worker(), and then adding supplementary data in the dictionary.
To avoid conflicts on the names of the keys used by different tasks you should use key names obtained with
- Returns: a dictionary describing the worker.
- Return type: dict
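The override pattern described for analyze_worker() can be sketched as follows. This is a minimal illustration rather than debusine's own code: the Task class below is a simplified stand-in for the real base class, ExampleBuild and its "chroots" key are hypothetical, and building key names with prefix_with_task_name() is an assumption based on that helper's description.

```python
class Task:
    """Simplified stand-in for debusine's Task base class (assumption)."""

    TASK_VERSION = None

    def __init__(self):
        # The task name is the class name converted to lowercase.
        self.name = self.__class__.__name__.lower()

    def prefix_with_task_name(self, text: str) -> str:
        # Prefix the text with the task name and a colon.
        return f"{self.name}:{text}"

    def analyze_worker(self) -> dict:
        # Assumed base behaviour: publish the task version under a prefixed key.
        return {self.prefix_with_task_name("version"): self.TASK_VERSION}


class ExampleBuild(Task):
    """Hypothetical task used only for illustration."""

    TASK_VERSION = 1

    def analyze_worker(self) -> dict:
        # Start from the base metadata, then add task-specific entries.
        metadata = super().analyze_worker()
        # Hypothetical key: a real task would inspect the worker here.
        metadata[self.prefix_with_task_name("chroots")] = ["bookworm"]
        return metadata


metadata = ExampleBuild().analyze_worker()
```

Prefixing every key with the task name keeps the entries of different tasks from colliding once they are merged into a single dictionary.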
Return dictionary with metadata for each task in Task._sub_tasks.
Subclasses of Task get registered in Task._sub_tasks. Return a dictionary with the metadata of each of the subtasks.
This method is executed in the worker when submitting the dynamic metadata.
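The aggregation step could look roughly like the sketch below. It assumes that each registered subclass can be instantiated without arguments and that the per-task dictionaries are simply merged; the Task class is a simplified stand-in, and the name collect_all_metadata() is hypothetical because the real method name is not shown above.

```python
class Task:
    """Simplified stand-in for debusine's Task base class (assumption)."""

    TASK_VERSION = None
    _sub_tasks = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass under its lowercased class name.
        Task._sub_tasks[cls.__name__.lower()] = cls

    def __init__(self):
        self.name = self.__class__.__name__.lower()

    def analyze_worker(self) -> dict:
        # Assumed base behaviour: one prefixed key holding the task version.
        return {f"{self.name}:version": self.TASK_VERSION}

    @staticmethod
    def collect_all_metadata() -> dict:
        # Hypothetical name: merge the metadata of every registered task.
        metadata = {}
        for task_class in Task._sub_tasks.values():
            metadata.update(task_class().analyze_worker())
        return metadata


class ExampleBuild(Task):
    TASK_VERSION = 1


class ExampleLint(Task):
    TASK_VERSION = 3


all_metadata = Task.collect_all_metadata()
```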
can_run_on(worker_metadata: dict) → bool
Check if the specified worker can run the task.
This method shall take its decision solely based on the supplied
worker_metadata and on the configured task data (
The default implementation always returns True unless the TASK_VERSION on the scheduler side differs from the one on the worker side.
Derived objects can implement further checks by overriding the method in the following way:
    if not super().can_run_on(worker_metadata):
        return False

    if ...:
        return False

    return True
- Parameters: worker_metadata (dict) – The metadata collected from the worker by running analyze_worker() on all the tasks on the worker under consideration.
- Returns: the boolean result of the check.
- Return type: bool
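A fuller version of the can_run_on() override might look like the sketch below. It is only an illustration: the Task class is a simplified stand-in, the way the worker's task version is stored in the metadata (a "name:version" key) is an assumption, and ExampleBuild, its "distribution" task data and its "chroots" key are hypothetical.

```python
class Task:
    """Simplified stand-in for debusine's Task base class (assumption)."""

    TASK_VERSION = None

    def __init__(self):
        self.name = self.__class__.__name__.lower()
        self.data = None  # normally filled in by configure()

    def prefix_with_task_name(self, text: str) -> str:
        return f"{self.name}:{text}"

    def can_run_on(self, worker_metadata: dict) -> bool:
        # Assumed default: refuse only when the worker's version differs.
        version_key = self.prefix_with_task_name("version")
        return worker_metadata.get(version_key) == self.TASK_VERSION


class ExampleBuild(Task):
    """Hypothetical task that needs a chroot for its distribution."""

    TASK_VERSION = 1

    def can_run_on(self, worker_metadata: dict) -> bool:
        if not super().can_run_on(worker_metadata):
            return False

        # Decide using only the worker metadata and the configured task data.
        chroots = worker_metadata.get(self.prefix_with_task_name("chroots"), [])
        if self.data["distribution"] not in chroots:
            return False

        return True


task = ExampleBuild()
task.data = {"distribution": "bookworm"}  # set directly to keep the sketch short

good_worker = {"examplebuild:version": 1, "examplebuild:chroots": ["bookworm"]}
old_worker = {"examplebuild:version": 0, "examplebuild:chroots": ["bookworm"]}
bare_worker = {"examplebuild:version": 1}

results = [task.can_run_on(w) for w in (good_worker, old_worker, bare_worker)]
```

Here only the first worker qualifies: the second runs a different task version and the third advertises no matching chroot.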
class_from_name(sub_task_class_name: str) → Type[debusine.tasks._task.Task]
Return the class for sub_task_class_name (case-insensitive).
__init_subclass__() registers Task subclasses in Task._sub_tasks.
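The registration and lookup mechanism can be illustrated with a short sketch. The Task class is a simplified stand-in; storing classes under their lowercased name, raising ValueError for an unknown name, and treating is_valid_task_name() as case-insensitive are all assumptions made for this example.

```python
from typing import Dict, Type


class Task:
    """Simplified stand-in for debusine's Task base class (assumption)."""

    _sub_tasks: Dict[str, Type["Task"]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Lowercased keys make the later lookup case-insensitive.
        Task._sub_tasks[cls.__name__.lower()] = cls

    @staticmethod
    def class_from_name(sub_task_class_name: str) -> Type["Task"]:
        key = sub_task_class_name.lower()
        if key not in Task._sub_tasks:
            # Assumed error type for an unknown task name.
            raise ValueError(f"{sub_task_class_name!r} is not a registered task")
        return Task._sub_tasks[key]

    @staticmethod
    def is_valid_task_name(task_name: str) -> bool:
        return task_name.lower() in Task._sub_tasks


class ExampleBuild(Task):
    """Hypothetical task; defining the class is enough to register it."""


found = Task.class_from_name("EXAMPLEBUILD")
```

Because registration happens when the class body is executed, a task is only known once the module that defines it has been imported.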
Configure the task with the supplied
The supplied data is first validated against the JSON schema defined in the TASK_DATA_SCHEMA class attribute. If validation fails, a TaskConfigError is raised. Otherwise, the supplied task_data is stored in the data attribute.
Derived objects can extend the behaviour by overriding the method and calling
super().configure(task_data). However, the extra checks must not access any resource of the worker, as the method can also be executed on the server when it tries to schedule work requests.
- Parameters: task_data (dict) – The supplied data describing the task.
- Raises: TaskConfigError – if the JSON schema is not respected.
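The configure() flow can be sketched as below. To stay self-contained, the example does not use the jsonschema library: a simplified check of the schema's "required" keys stands in for full JSON schema validation. The Task and TaskConfigError classes are stand-ins, and ExampleBuild with its "distribution" field is hypothetical.

```python
class TaskConfigError(Exception):
    """Stand-in for debusine's TaskConfigError (assumption)."""


class Task:
    """Simplified stand-in for debusine's Task base class (assumption)."""

    TASK_DATA_SCHEMA = {}

    def __init__(self):
        self.data = None

    def configure(self, task_data: dict) -> None:
        # Simplified stand-in for JSON schema validation: only the
        # "required" keys of the schema are checked here.
        required = self.TASK_DATA_SCHEMA.get("required", [])
        missing = [key for key in required if key not in task_data]
        if missing:
            raise TaskConfigError(f"missing required keys: {missing}")
        self.data = task_data


class ExampleBuild(Task):
    """Hypothetical task with one required field."""

    TASK_DATA_SCHEMA = {"type": "object", "required": ["distribution"]}

    def configure(self, task_data: dict) -> None:
        super().configure(task_data)
        # Extra check that relies on the task data only, never on worker
        # resources, since configure() may also run on the server.
        distribution = self.data["distribution"]
        if not isinstance(distribution, str) or not distribution:
            raise TaskConfigError("distribution must be a non-empty string")


task = ExampleBuild()
task.configure({"distribution": "bookworm"})
```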
Set the object to access the server.
Validated task data submitted through
execute() → bool
Execute the requested task.
The task must first have been configured. It is allowed to take as much time as required. This method will only be run on a worker. It is thus allowed to access resources local to the worker.
It is recommended to fail early by raising a TaskConfigError if the parameters of the task let you anticipate that it has no chance of completing successfully.
- Returns: True to indicate success, False for a failure.
- Return type: bool
- Raises: TaskConfigError – if the parameters of the work request are incompatible with the worker.
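The "fail early" recommendation might translate into something like the following sketch. ExampleBuild, its "workdir" task data and the marker file it writes are all hypothetical, TaskConfigError is a stand-in class, and the Task base class is omitted to keep the example short.

```python
import os
import tempfile


class TaskConfigError(Exception):
    """Stand-in for debusine's TaskConfigError (assumption)."""


class ExampleBuild:
    """Hypothetical task; the real Task base class is omitted here."""

    def __init__(self, data: dict):
        self.data = data  # stand-in for the data stored by configure()

    def execute(self) -> bool:
        workdir = self.data["workdir"]
        # Fail early: execute() runs on the worker, so it may inspect
        # local resources before starting any expensive work.
        if not os.path.isdir(workdir):
            raise TaskConfigError(f"{workdir} does not exist on this worker")

        # Hypothetical "work": write a marker file in the directory.
        marker = os.path.join(workdir, "done")
        with open(marker, "w") as handle:
            handle.write("ok\n")
        return os.path.exists(marker)


with tempfile.TemporaryDirectory() as workdir:
    succeeded = ExampleBuild({"workdir": workdir}).execute()

    try:
        ExampleBuild({"workdir": os.path.join(workdir, "missing")}).execute()
        failed_early = False
    except TaskConfigError:
        failed_early = True
```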
is_valid_task_name(task_name) → bool
Return True if task_name is registered (its class is imported).
A logging.Logger instance that can be used in child classes when you override methods to implement the task.
The name of the task. It is computed by
__init__() by converting the class name to lowercase.
prefix_with_task_name(text: str) → str
- Returns: text prefixed with the task name and a colon.