Database as a Work Queue

Michael Gluckman
2 min readFeb 16, 2022

Sometimes you don’t need a complicated work queue system when a database would be sufficient.

Some of the benefits of using a database as a work queue:

  • Visibility of the actual task content
  • Query-able to allow you to find and query task content
  • Easy to reset a task status in the table to re-fire in the event of an error

We use it to persist domain events into the the work queue and execute tasks asynchronously. You could very well use rabbitMQ for this purpose but our throughput is not an issue for us.

Java Polling Framework

** We adapted the excellent database queue library from yoomoney for our specific needs.

Work Queue Table Schema

The columns to note are:

  • publish_event_task_payload (contains the actual task payload)
  • task_status (scheduled, failed, processed)
  • max_task_attempt (max task attempts before task_status set to failed)
  • next_process_at (when the task is scheduled to be processed)
  • attempt (current number of attempts)

Outbox pattern for peace of mind

A service must atomically update the database and send messages in order to avoid data inconsistencies and bugs. A service that uses a relational database inserts messages/events into an outbox table (e.g. MESSAGE) as part of the local transaction.
https://microservices.io/patterns/data/transactional-outbox.html

This ensures that we get those domain events written as part of the local transaction.

Postgres Skip Lock

With SKIP LOCKED, any selected rows that cannot be immediately locked are skipped. Skipping locked rows provides an inconsistent view of the data, so this is not suitable for general purpose work, but can be used to avoid lock contention with multiple consumers accessing a queue-like table.
https://www.postgresql.org/docs/9.5/sql-select.html

A polling framework thread updates the task status to processing to prevent other threads from locking the row for update (skip part). Processing a task happens within the context of a database transaction so if the task processing fails the task is rescheduled and reprocessed by another thread.

The framework supports multiple strategies to re-enqueue tasks e.g. time, geometric progression.

Most of our tasks invoke grpc service calls. It is important to ensure that your grpc services are idempotent in the event that a task refires.

--

--