Live Log Streaming

Debusine lets you view the full log of a task after it finishes. But while a task is still running, there is nothing to see. You have to wait until it completes before you can tell whether anything went wrong. For long-running tasks like package builds, this means you might wait twenty minutes only to discover a failure that happened in the first two.

This blueprint describes how to make task output visible in real time, while the task is still running.

Architecture Overview

The change involves three layers working together:

Worker: as a task runs, the worker reads output from the subprocess and accumulates it in a buffer. The buffer is flushed to the server over the existing WebSocket connection as a log_data message whenever either a size threshold (64 KB) or a time threshold (1 second) is reached, whichever comes first.
Server: as log_data messages arrive, the server writes each chunk into a Redis Stream, one entry per message, keyed by the work request ID. Any number of readers can then consume from that stream independently.
Browser: a small JavaScript snippet on the task detail page subscribes to a server endpoint that reads from the Redis Stream and pushes chunks down to the browser as they arrive. The page shows the output appearing line by line, without any refresh.

The overall flow looks like this:

Worker subprocess
      |
      | (time/size-bounded chunks)
      v
Worker WebSocket client  -- log_data message -->  Server WebSocket handler
                                                          |
                                                          | xadd
                                                          v
                                                    Redis Stream
                                                  task:logs:{id}
                                                          |
                                                          | xread (blocking)
                                                          v
                                                  Django streaming view
                                                  (new WebSocket consumer)
                                                          |
                                                          | WebSocket
                                                          v
                                                    Browser (task page)

SERVER task (runs on Celery worker, no WebSocket connection)
      |
      | xadd (directly)
      v
Redis Stream  task:logs:{id}  (same path from here onward)

SERVER tasks run on a Celery worker and do not have a WebSocket connection to the server. They write log data directly to Redis instead of routing it through the worker WebSocket handler. From the Redis Stream onward, the flow is identical to the worker path above.

Worker-Side Changes

The worker already runs tasks as subprocesses. Tasks are defined under debusine/task/ and are executed through the executor layer at debusine/task/executor/. The change here is to capture output from that execution incrementally, rather than waiting for the task to finish.

There are two possible approaches to capturing the output:

Option A: Stream from the subprocess directly: the executor runs subprocesses with their stdout and stderr exposed as byte streams. The worker reads from those streams and feeds the bytes into a time/size-bounded buffer, which is flushed to the server as a log_data message whenever 64 KB accumulates or 1 second elapses. This may require some restructuring of the executor API. Stdout and stderr are buffered independently so their stream labels are preserved in each message.
Option B: Watch the log file: if the executor already writes output to a log file on disk, the worker can tail that file while the task runs and forward new lines as they appear. This avoids touching the executor internals and may be simpler to integrate. It also opens the door to streaming multiple log files per task in the future, since the watcher is not tied to a single subprocess stream.

The chosen approach is Option A. This gives a single combined log stream per job. Option B was considered and discarded. While it avoids touching the executor internals, it introduces filesystem indirection and makes it harder to reason about ordering and completeness. The simpler, more direct approach is preferred.

The stream should include more than just the subprocess output. Internal worker events, such as setting up the executor, downloading input artifacts, and uploading result artifacts, are useful context for someone watching a task run. These events will be written into the same stream as stdout and stderr, tagged with an identifier so the browser can display or filter them separately.

The worker does not send output to the WebSocket as each line arrives. Instead, it accumulates output in a buffer and flushes it as a log_data message when either of two conditions is met: the buffer has grown to 64 KB, or 1 second has elapsed since the last flush, whichever comes first.
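The two flush conditions can be sketched as a small buffer class. This is a minimal illustration, not the planned implementation: the class name ChunkBuffer, the send callback, and the tick method (which some periodic timer in the worker would have to call) are all hypothetical.

```python
import time
from typing import Callable


class ChunkBuffer:
    """Accumulate bytes and flush on a size or age threshold (hypothetical sketch)."""

    def __init__(
        self,
        send: Callable[[bytes], None],
        max_bytes: int = 64 * 1024,
        max_age: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._max_bytes = max_bytes
        self._max_age = max_age
        self._clock = clock
        self._buf = bytearray()
        self._last_flush = clock()

    def feed(self, data: bytes) -> None:
        """Add subprocess output; emit full-size chunks as soon as they exist."""
        self._buf.extend(data)
        while len(self._buf) >= self._max_bytes:
            self._emit(self._max_bytes)
        self.tick()

    def tick(self) -> None:
        """Flush pending bytes if max_age seconds passed since the last flush."""
        if self._buf and self._clock() - self._last_flush >= self._max_age:
            self._emit(len(self._buf))

    def flush(self) -> None:
        """Flush whatever is pending, for example when the task ends."""
        if self._buf:
            self._emit(len(self._buf))

    def _emit(self, size: int) -> None:
        chunk = bytes(self._buf[:size])
        del self._buf[:size]
        self._last_flush = self._clock()
        self._send(chunk)
```

Injecting the clock keeps the time threshold testable without sleeping. A chunk emitted this way can end mid-line, which matches the message semantics described next.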

Each message carries the raw accumulated bytes in a data field, which may span multiple lines and may end mid-line depending on when the flush threshold was reached. The message looks like this:

{
    "type": "log_data",
    "work_request_id": 42,
    "data": "Building package foo 1.2.3...\nChecking dependencies...\n",
    "timestamp": "2026-05-09T14:23:01+00:00",
    "stream": "stdout"
}

The stream field distinguishes between stdout, stderr, and internal, so the server and browser can display them differently if needed (for example, showing stderr lines in a different colour). Because a single chunk may contain only one stream type, stdout and stderr have separate buffers. However, whenever new output arrives on one stream, all other streams are flushed first, so that interleaving is preserved in the order the server receives messages.
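The flush-others-first rule can be illustrated as follows. This is a sketch under assumptions: StreamMux and its send callback are hypothetical names, and the size and time thresholds are left out so that only the ordering rule is visible.

```python
from typing import Callable, Dict

STREAMS = ("stdout", "stderr", "internal")


class StreamMux:
    """Keep one buffer per stream while preserving arrival order (hypothetical sketch)."""

    def __init__(self, send: Callable[[str, bytes], None]) -> None:
        self._send = send
        self._buffers: Dict[str, bytearray] = {name: bytearray() for name in STREAMS}

    def feed(self, stream: str, data: bytes) -> None:
        # Flush every other stream first, so output that arrived earlier
        # is sent before the newer output on this stream.
        for name, buf in self._buffers.items():
            if name != stream and buf:
                self._send(name, bytes(buf))
                buf.clear()
        self._buffers[stream].extend(data)

    def flush_all(self) -> None:
        """Send whatever is still buffered, for example at task end."""
        for name, buf in self._buffers.items():
            if buf:
                self._send(name, bytes(buf))
                buf.clear()
```

A consequence of this rule is that at most one buffer holds data at any moment, so consecutive writes to the same stream are still batched while a switch between streams forces a flush.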

If the buffer fills up faster than the WebSocket can drain it (for example, a task emitting output at very high speed), the worker will drop buffered data rather than block the task, and send a synthetic notice instead:

{
    "type": "log_data",
    "work_request_id": 42,
    "data": "[some data skipped]",
    "timestamp": "2026-05-09T14:23:01+00:00",
    "stream": "internal"
}

This keeps the worker from falling behind or consuming unbounded memory, while still giving the user a visible signal that some output was lost.
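One way to get this behaviour is a bounded outbox between the task reader and the WebSocket sender. The sketch below is an assumption about how that could look: BoundedOutbox, its byte limit, and the choice to discard everything queued (and the new chunk too, if it still does not fit) are illustrative and not specified by this blueprint.

```python
from collections import deque
from typing import Deque, Optional, Tuple

SKIP_NOTICE = b"[some data skipped]"


class BoundedOutbox:
    """Queue of pending chunks with a byte limit (hypothetical sketch)."""

    def __init__(self, max_pending_bytes: int) -> None:
        self._max = max_pending_bytes
        self._queue: Deque[Tuple[str, bytes]] = deque()
        self._pending = 0

    def put(self, stream: str, data: bytes) -> None:
        """Called by the task reader; never blocks."""
        if self._pending + len(data) > self._max:
            # Overflow: discard everything queued and leave a single notice.
            self._queue.clear()
            self._queue.append(("internal", SKIP_NOTICE))
            self._pending = len(SKIP_NOTICE)
            if self._pending + len(data) > self._max:
                return  # the new chunk does not fit either, so drop it too
        self._queue.append((stream, data))
        self._pending += len(data)

    def get(self) -> Optional[Tuple[str, bytes]]:
        """Called by the sender when the WebSocket is ready for more."""
        if not self._queue:
            return None
        stream, data = self._queue.popleft()
        self._pending -= len(data)
        return stream, data
```

Because repeated overflows replace the queue with one notice each time, the user sees at most one notice per gap rather than a flood of them.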

If the WebSocket connection drops mid-task, the worker should not crash. It should log the failure locally and continue running the task. Losing the live stream is acceptable but losing the task result is not.
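As a rough illustration of that rule, a best-effort send wrapper might look like the sketch below. The function name send_best_effort and the send callable are hypothetical, and catching OSError is an assumption: a real WebSocket library will likely raise its own exception types, which would need to be added.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("worker.live_log")


def send_best_effort(
    send: Callable[[Dict[str, Any]], None], message: Dict[str, Any]
) -> bool:
    """Try to send a live log message; never let a failure reach the task."""
    try:
        send(message)
    except OSError as exc:  # ConnectionError and its subclasses derive from OSError
        logger.warning("Live log message dropped: %s", exc)
        return False
    return True
```

Only the live log path is wrapped this way. Reporting the task result would keep whatever retry behaviour the worker already has.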

When the task finishes, the worker sends a final log_data_end message with the exit code, so the server knows the stream is complete:

{
    "type": "log_data_end",
    "work_request_id": 42,
    "exit_code": 0,
    "timestamp": "2026-05-09T14:25:00+00:00"
}

For reference, here are representative examples of each message variant the worker may send:

stdout chunk (normal task output):

{
    "type": "log_data",
    "work_request_id": 42,
    "data": "Building package foo 1.2.3...",
    "timestamp": "2026-05-09T14:23:01+00:00",
    "stream": "stdout"
}

stderr chunk (error or diagnostic output from the subprocess):

{
    "type": "log_data",
    "work_request_id": 42,
    "data": "warning: deprecated function used",
    "timestamp": "2026-05-09T14:23:05+00:00",
    "stream": "stderr"
}

internal chunk (worker lifecycle events such as setup or artifact upload):

{
    "type": "log_data",
    "work_request_id": 42,
    "data": "Downloading input artifact foo.dsc",
    "timestamp": "2026-05-09T14:22:58+00:00",
    "stream": "internal"
}

skipped-messages notice (emitted when the buffer overflows):

{
    "type": "log_data",
    "work_request_id": 42,
    "data": "[some data skipped]",
    "timestamp": "2026-05-09T14:23:10+00:00",
    "stream": "internal"
}
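
The message variants above could be produced by two small helpers. This is a sketch only: the function names are hypothetical, and it assumes the chunk has already been decoded to text before it is placed in the data field.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional

VALID_STREAMS = {"stdout", "stderr", "internal"}


def _utc_timestamp(now: Optional[datetime] = None) -> str:
    """Format a timestamp like 2026-05-09T14:23:01+00:00."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).isoformat(timespec="seconds")


def make_log_data(
    work_request_id: int, data: str, stream: str, now: Optional[datetime] = None
) -> Dict[str, Any]:
    """Build a log_data message for one flushed chunk."""
    if stream not in VALID_STREAMS:
        raise ValueError(f"unknown stream: {stream!r}")
    return {
        "type": "log_data",
        "work_request_id": work_request_id,
        "data": data,
        "timestamp": _utc_timestamp(now),
        "stream": stream,
    }


def make_log_data_end(
    work_request_id: int, exit_code: int, now: Optional[datetime] = None
) -> Dict[str, Any]:
    """Build the final log_data_end message."""
    return {
        "type": "log_data_end",
        "work_request_id": work_request_id,
        "exit_code": exit_code,
        "timestamp": _utc_timestamp(now),
    }


if __name__ == "__main__":
    print(json.dumps(make_log_data(42, "Building package foo 1.2.3...", "stdout")))
```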

Server-Side: Storing Logs in Redis

When the server receives a log_data message, it writes the data into a Redis Stream. Each work request gets its own stream, keyed by its ID:

task:logs:{work_request_id}

So for work request 42, the key would be task:logs:42.

Each entry in the stream stores three fields:

data      ->  raw chunk of log output (may span multiple lines or end mid-line)
timestamp ->  ISO 8601 UTC timestamp from the worker, taken at flush time
              (represents when the last byte in the chunk arrived)
stream    ->  "stdout", "stderr", or "internal"

Redis Streams are a good fit here because they are an ordered, persistent log. A consumer that connects late can still request all entries from the beginning by starting from ID 0. This is different from Redis Pub/Sub, where late consumers miss anything sent before they connected, which would be wrong for a task log where you want the full output from the start.

To prevent the stream from growing without bound for very verbose tasks, the server applies a maximum length when writing:

r.xadd(f"task:logs:{work_request_id}", entry, maxlen=10000, approximate=True)
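
Put together, the server-side write path might look like the sketch below. The helper names log_stream_key and store_log_chunk are hypothetical, and r is assumed to be a redis-py style client exposing xadd with the maxlen and approximate keyword arguments used above.

```python
from typing import Any, Mapping

MAX_STREAM_ENTRIES = 10000


def log_stream_key(work_request_id: int) -> str:
    """Return the Redis Stream key for a work request."""
    return f"task:logs:{work_request_id}"


def store_log_chunk(r: Any, message: Mapping[str, Any]) -> Any:
    """Write one log_data message into the work request's stream."""
    # Only the three documented fields are stored; the message type and
    # the work request ID are implied by the key.
    entry = {
        "data": message["data"],
        "timestamp": message["timestamp"],
        "stream": message["stream"],
    }
    return r.xadd(
        log_stream_key(message["work_request_id"]),
        entry,
        maxlen=MAX_STREAM_ENTRIES,
        approximate=True,
    )
```

Since SERVER tasks write to Redis directly, they could call the same helper and keep the entry layout identical on both paths.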

When the task completes, the server uploads the full log content as an artifact. After that, the Redis key is deleted, since the durable copy now lives in the artifact store.

Server-Side: The Streaming View

The server needs a view that a browser can subscribe to and receive log chunks from as they arrive. This view reads from the Redis Stream for the requested work request and pushes each chunk down to the client.

Debusine already has two WebSocket consumers: one for the worker connection, and one for clients waiting for a job to complete. The streaming view will be a third WebSocket consumer, added in the same file as the existing two. It handles browser connections that want to watch a running task’s output. WebSockets were chosen over Server-Sent Events because they are bidirectional, which leaves the door open for client-to-server feedback in the future, such as backpressure signalling or explicit acknowledgements from the browser.

The view works as follows. Given a work_request_id, it:

Checks that the requesting user has permission to view that work request.
Opens a blocking xread loop on task:logs:{work_request_id}, starting from ID 0 to get all chunks from the beginning.
Sends each chunk to the client as it arrives.
Stops when the work request status moves to completed or aborted, and closes the connection.

If the work request is already completed when the browser subscribes, the view reads the full stream from Redis (if the key still exists) or falls back to the artifact. This way, the same view works for both live and recently finished tasks.
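
The read loop at the heart of this view can be sketched as a generator. This is an illustration under assumptions: iter_log_entries and the is_finished callback are hypothetical, r is assumed to be a redis-py style client whose xread returns a list of [stream name, entries] pairs, and the permission check and artifact fallback are left out. A real consumer would likely use an asynchronous client instead.

```python
from typing import Any, Callable, Dict, Iterator, Tuple


def iter_log_entries(
    r: Any,
    work_request_id: int,
    is_finished: Callable[[], bool],
    block_ms: int = 1000,
) -> Iterator[Tuple[Any, Dict[Any, Any]]]:
    """Yield (entry_id, fields) from the start of the stream until the task ends."""
    key = f"task:logs:{work_request_id}"
    last_id: Any = "0"  # ID 0 means: start from the very first entry
    while True:
        # Check the status before reading, so that entries written just
        # before the task finished are still drained by one last read.
        finished = is_finished()
        response = r.xread({key: last_id}, block=None if finished else block_ms)
        got_any = False
        for _stream_name, entries in response or []:
            for entry_id, fields in entries:
                last_id = entry_id
                got_any = True
                yield entry_id, fields
        if finished and not got_any:
            return
```

Blocking with a timeout rather than indefinitely gives the loop a chance to notice that the work request has finished even when no new output arrives.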

Note

The WebSocket protocol between the server and the browser (the exact message format, event types, and connection lifecycle) will be designed in detail at a later stage, once the worker and server storage layers are in place.

Log Persistence

When a task finishes, the live Redis Stream has served its purpose. The full log needs to move somewhere durable before the stream is cleaned up.

The existing system already collects the task output at that point and uploads it as an artifact attached to the work request. That behavior does not change.

After the existing artifact upload completes successfully, the Redis key is deleted since the durable copy now lives in the artifact store.

A hard expiry is also set on the Redis key as a safety net, independent of whether the artifact upload succeeds. This prevents orphaned streams from accumulating in Redis if something goes wrong silently:

r.expire(f"task:logs:{work_request_id}", 60 * 60 * 24)  # 24 hours
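
The cleanup sequence described in this section could be sketched as follows. The names finalize_log_stream and upload_artifact are hypothetical, r is assumed to be a redis-py style client, and setting the expiry just before the upload is an assumption about timing that this blueprint does not fix.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("server.live_log")

LOG_STREAM_TTL_SECONDS = 60 * 60 * 24  # 24 hours


def finalize_log_stream(
    r: Any, work_request_id: int, upload_artifact: Callable[[], None]
) -> bool:
    """Delete the stream once the durable copy exists; return True on success."""
    key = f"task:logs:{work_request_id}"
    # Safety net first: even if the upload fails, the key will expire.
    r.expire(key, LOG_STREAM_TTL_SECONDS)
    try:
        upload_artifact()
    except Exception:
        logger.exception("Log artifact upload failed; leaving %s to expire", key)
        return False
    r.delete(key)
    return True
```

If the upload fails, the stream stays readable for up to 24 hours, so a browser can still fetch the log from Redis while the failure is investigated.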

Browser-Side

The worker and server changes in the previous steps are the core of the feature. The browser side makes it user-visible.

The task detail page gets a small vanilla JavaScript snippet. No framework, no build step. When the page loads for a task that is currently running, the snippet opens a WebSocket connection to the new streaming consumer and appends each incoming message to the page.

A few small things are worth handling:

Auto-scroll: the page should scroll to the bottom as new chunks arrive, so the user always sees the latest output without manual scrolling.
stderr styling: if the server includes a stream field in the WebSocket message, chunks from stderr can be given a different style (a muted colour or a small label) so they are visually distinct from stdout.
Already-completed tasks: if the page loads for a task that just finished, the same endpoint serves the full log from Redis or from the artifact, so the snippet does not need any special case.