.. _reference-metrics:

Available metrics
-----------------

Debusine supports `Prometheus `__ monitoring using
the ``/api/1.0/open-metrics/`` endpoint. By default this endpoint is
available to anyone who can access the instance, and it reveals information
such as valid scope names and queue sizes. Administrators whose instances
carry significant commercially sensitive activity may wish to restrict
public access to it in their front-end load balancer configuration.
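
As an illustration of reading this endpoint, the following minimal sketch
builds the endpoint URL and downloads the raw metrics text using only the
Python standard library. The helper names ``metrics_url`` and
``fetch_metrics`` are hypothetical and not part of Debusine; only the
endpoint path is taken from the text above.

.. code-block:: python

   from urllib.parse import urljoin
   from urllib.request import urlopen

   # Endpoint path as documented above.
   METRICS_PATH = "/api/1.0/open-metrics/"


   def metrics_url(base_url: str) -> str:
       """Return the metrics endpoint URL for a Debusine instance."""
       # METRICS_PATH is absolute, so it replaces any path in base_url.
       return urljoin(base_url, METRICS_PATH)


   def fetch_metrics(base_url: str, timeout: float = 10.0) -> str:
       """Download the raw metrics text (requires network access)."""
       with urlopen(metrics_url(base_url), timeout=timeout) as response:
           return response.read().decode("utf-8")

For example, ``fetch_metrics("https://debusine.example.org")`` would return
the raw metrics text, where ``debusine.example.org`` is a placeholder host
name. If access has been restricted at the load balancer, the call would be
expected to raise an error whose details depend on how the restriction is
configured.
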

Note that the metric names below may carry an additional suffix, depending
on the metric type and on the library used to parse the raw metrics output:
counter metrics have a ``_total`` suffix, while histogram metrics are
exposed as several series with ``_bucket``, ``_sum`` and ``_count``
suffixes.
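
When matching series from the raw output against the names in the tables
below, the suffix can be stripped first. The following is a minimal sketch;
``base_metric_name`` is a hypothetical helper, and the suffix list assumes
the usual Prometheus conventions, where histograms also expose ``_sum`` and
``_count`` series in addition to ``_bucket``.

.. code-block:: python

   # Suffixes added to counter and histogram series in the raw output.
   SUFFIXES = ("_total", "_bucket", "_sum", "_count")


   def base_metric_name(sample_name: str) -> str:
       """Strip a known type suffix from a sample name, if one is present."""
       for suffix in SUFFIXES:
           if sample_name.endswith(suffix):
               return sample_name[: -len(suffix)]
       return sample_name

This is deliberately naive: a gauge whose own name happened to end in one of
these suffixes would be shortened too. None of the metric names listed on
this page end that way.
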

Database metrics
================

The following metrics measure various kinds of objects in Debusine's
database:

.. list-table::
   :widths: 20 20 15 45
   :header-rows: 1

   * - Name
     - Labels
     - Type
     - Description
   * - ``artifacts``
     - ``category``, ``scope``
     - gauge
     - number of :ref:`artifacts ` of each category
       in each scope
   * - ``assets``
     - ``category``, ``scope``
     - gauge
     - number of :ref:`assets ` of each category in
       each scope
   * - ``collections``
     - ``category``, ``scope``
     - gauge
     - number of :ref:`collections ` of each
       category in each scope
   * - ``file_store_max_size``
     - ``backend``, ``name``
     - gauge
     - configured maximum sizes of :ref:`file stores `
   * - ``file_store_size``
     - ``backend``, ``name``
     - gauge
     - total sizes of :ref:`file stores `
   * - ``groups``
     - ``ephemeral``, ``scope``
     - gauge
     - number of :ref:`groups ` in each scope;
       ephemeral groups will be garbage-collected when empty
   * - ``tokens``
     - ``enabled``, ``user``, ``worker``
     - gauge
     - number of tokens associated with users and workers
   * - ``users``
     - ``active``
     - gauge
     - number of users, excluding the system user
   * - ``user_activity``
     - ``scope``
     - histogram
     - number of users who have created a workflow in each scope over the
       last N days
   * - ``user_identities``
     - ``active``, ``issuer``
     - gauge
     - number of user single-sign-on identities by issuer
   * - ``user_identities_activity``
     - ``issuer``, ``scope``
     - histogram
     - number of single-sign-on identities by issuer for users who have
       created a workflow in each scope over the last N days
   * - ``work_requests``
     - ``task_type``, ``task_name``, ``scope``, ``status``,
       ``build_architecture``, ``backend``
     - gauge
     - number of :ref:`work requests ` broken
       down in various ways
   * - ``workers``
     - ``connected``, ``busy``, ``worker_type``, ``worker_pool``,
       ``native_architecture``, ``architecture_*``
     - gauge
     - number of :ref:`workers ` broken down in
       various ways
   * - ``worker_pool_runtime``
     - ``worker_pool``
     - gauge
     - time in seconds spent running tasks per :ref:`worker pool `
   * - ``workflow_templates``
     - ``task_name``, ``scope``
     - gauge
     - number of :ref:`workflow templates `
       for each :ref:`task name ` in each scope
   * - ``workspaces``
     - ``private``, ``expires``, ``scope``
     - gauge
     - number of :ref:`workspaces ` in each scope

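
Because each gauge above is split across several labels, a common first step
is to sum its samples over one label. The following minimal sketch does this
for ``work_requests`` using a simplified parser for the text format. The
sample lines, label values and helper names (``parse_samples``,
``total_by_label``) are made up for illustration; real output carries all of
the labels listed in the table, and a maintained parser library is
preferable for production use.

.. code-block:: python

   import re
   from collections import Counter

   # Illustrative data only: not output from a real Debusine instance.
   SAMPLE = (
       "# TYPE work_requests gauge\n"
       'work_requests{scope="debian",status="pending"} 12\n'
       'work_requests{scope="debian",status="running"} 3\n'
       'work_requests{scope="other",status="pending"} 5\n'
   )

   # Simplified grammar: name, optional {labels}, then a numeric value.
   SAMPLE_RE = re.compile(
       r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)"
       r"(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)"
   )
   LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


   def parse_samples(text):
       """Yield (name, labels, value) for each sample line in the text."""
       for line in text.splitlines():
           if not line or line.startswith("#"):
               continue  # skip blank lines and HELP/TYPE/EOF comments
           match = SAMPLE_RE.match(line)
           if match is None:
               continue
           labels = dict(LABEL_RE.findall(match["labels"] or ""))
           yield match["name"], labels, float(match["value"])


   def total_by_label(text, metric, label):
       """Sum the samples of one metric, grouped by a single label."""
       totals = Counter()
       for name, labels, value in parse_samples(text):
           if name == metric:
               totals[labels.get(label, "")] += value
       return dict(totals)

With the sample data above, ``total_by_label(SAMPLE, "work_requests",
"scope")`` groups the three samples into one total per scope, and passing
``"status"`` instead groups them by status.
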

Operational metrics
===================

Debusine exports many metrics provided by `django-prometheus
`__, which are
generally self-documented by ``HELP`` lines in the raw metrics output. Some
particularly useful ones are:

.. list-table::
   :widths: 20 20 15 45
   :header-rows: 1

   * - Name
     - Labels
     - Type
     - Description
   * - ``django_http_requests_latency_seconds_by_view_method``
     - ``method``, ``view``
     - histogram
     - Request processing time (including middleware)
   * - ``django_http_responses_total_by_status_view_method``
     - ``method``, ``status``, ``view``
     - counter
     - Count of responses by method, status, and view


Other operational metrics are provided by Debusine itself:

.. list-table::
   :widths: 20 20 15 45
   :header-rows: 1

   * - Name
     - Labels
     - Type
     - Description
   * - ``debusine_active_websocket_connections``
     - ``view``
     - gauge
     - Active connections to the Debusine server websocket
   * - ``debusine_signon_oidc_callbacks``
     - ``issuer``, ``activated``
     - counter
     - OpenID Connect authentication callbacks by issuer and whether the
       identity activation succeeded
   * - ``debusine_websocket_connect_requests``
     - ``view``
     - counter
     - Total connections to the Debusine server websocket
   * - ``debusine_work_request_start_latency``
     - ``task_type``, ``task_name``, ``priority``
     - histogram
     - Time in seconds that work requests spent waiting to start
   * - ``debusine_workflow_orchestrator_duration``
     - ``task_name``
     - histogram
     - Time in seconds that workflow orchestrators took to run

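
Histogram metrics such as ``debusine_work_request_start_latency`` are
exposed as cumulative buckets, so percentiles have to be estimated rather
than read directly. The following minimal sketch estimates a quantile by
linear interpolation within the matching bucket, which is the same general
idea as the ``histogram_quantile`` function in PromQL. The helper name
``estimate_quantile`` is hypothetical, and the bucket bounds and counts are
invented for illustration; they are not Debusine's actual bucket
configuration.

.. code-block:: python

   import math

   # (upper bound in seconds, cumulative count); illustrative values only.
   BUCKETS = [(1.0, 2), (5.0, 6), (30.0, 9), (math.inf, 10)]


   def estimate_quantile(buckets, q):
       """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets."""
       ordered = sorted(buckets)
       total = ordered[-1][1]
       if total == 0:
           return math.nan  # no observations recorded yet
       rank = q * total
       prev_bound, prev_count = 0.0, 0
       for bound, count in ordered:
           if count >= rank:
               if math.isinf(bound) or count == prev_count:
                   # No finite upper bound (or an empty bucket): fall back
                   # to the highest bound known to be reached.
                   return prev_bound
               # Assume observations are spread evenly within the bucket.
               fraction = (rank - prev_count) / (count - prev_count)
               return prev_bound + (bound - prev_bound) * fraction
           prev_bound, prev_count = bound, count
       return prev_bound

With these invented buckets, ``estimate_quantile(BUCKETS, 0.5)`` falls in the
bucket between 1 and 5 seconds. Estimates for high quantiles are limited by
the largest finite bucket bound, so they should be read as lower bounds when
many observations land in the final bucket.
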