zlacker

I'm not opposed to a separate table or DB used exclusively for messaging separate from business level data, but I think I've never heard of anyone doing that in practice. It's almost always a bunch of hacky joins someone wants to do with the messages and the messaging gets directly coupled with the schema (foreign key constraints on transaction or event IDs and business entities is the usual case). Letting one or two things be used for purposes it wasn't necessarily designed for is alright but coupling the same place that has your OLTP together as your messaging system in itself is usually a culture where the same DB will be used in another month as a time series DB, then as an OLAP system, and then there's no time to properly decouple anything anymore / actually understand your needs and nobody wants to try to touch the very, very brittle DB that runs the entire business on its back.

It is a very, very slippery slope and it's seductive to put everything into Postgres at low scale but I've seen more places sunk by technical debt (because they can't move fast enough to catch up to parity with market demands, can't scale operationally with the budget, or address competitors' features) than ones that don't get enough traction where technical debt becomes a problem. But perhaps my experience is perversely biased and I'm just driven to over-engineer stuff because I've been routinely handed the reins of quite literally business non-viable software systems that were over-sold repeatedly while Silicon Valley sounds like it has the inverse problem of over-engineering for non-viable businesses.

replies(1): >>andrew+G

>>devonk+(OP)
I think you're saying it's not a good approach because developers will do it wrong?

That's hard to argue with.

The example code shows how it should be done - a simple table dedicated to messages which supplies the id of the work/job to be carried out.

Not much can be done architecturally to address developers messing that up.

replies(1): >>devonk+EV

>>andrew+G
While the pattern is simple programmatically, it is insufficient operationally and sets a precedent of coupling your data plane and your messaging plane.

Looking solely from a code perspective, how would this be done prior to Postgres 9.5 (before SKIP LOCKED)? Of the four examples I saw (either MySQL or Postgres), all of them choked constantly and had a pathological case of a job failing to execute expediently during peak usage which made on-call a nightmare. So what do we do about the places that can't upgrade their Postgres databases because it's too complicated to decouple messaging and business data now? Based upon my observations the answer is "it's never done" or the company goes under from lack of ability to enact changes caused by the paralysis.

The reality is that simple jobs are almost never enough outside toy examples. Soon, someone wants to be notified about job status changes - then comes an event table with a foreign key constraint on the job table's primary key. Also add in job digraphs with sub-tasks and workflows (AKA weak transactions) - these are non-trivial to do well. Depending upon your SQL variant and version combined with access patterns, row level locking and indexing can be an O(1) lookup or an O(n^2) nightmare that causes tons of contention depending (once again) upon the manner in which jobs are modified.

Instead of thinking about "what are my transaction boundaries and what do my asynchronous jobs look like?" which are much more invariant and important to the business trying to map their existing solution onto as many problems as possible. Then it should be more clear whether you would be fine with a RDBMS table, Airflow, SQS, Redis, RabbitMQ, JMS, etc. Operations-wise more components is certainly a headache, but I've had more headaches in production due to inappropriate technology for the problem domain than "this is just too many parts!"

replies(1): >>anaraz+3i1

>>devonk+EV
> Looking solely from a code perspective, how would this be done prior to Postgres 9.5 (before SKIP LOCKED)?

It's possible to implement SKIP LOCKED in userland in PostgreSQL (using NOWAIT, which has been in PG), although it's obviously a bit slower.