But I think a lot of it is also about knowledge and documentation. If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it. Yes maintenance is higher, but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source.
If I want to use NOTIFY in postgres? I googled "SQLALchemy notify listen postgres" and found a few unanswered stackoverflow questions and a github gist that has some code but no context.
I would honestly love to use this approach for a side project, but I don't have 2-3 days to figure it out on my own. The direct choice for me might seem to be
* simple, but not scalable (ie just use postgres)
* complex, but scalable (ie redis, sqs, whatever)
and then it's a tradeoff, and the argument goes that I am blinded by cool tech and FAANG and I'm choosing complex but scalable, even though I don't need scalable.
But taking into account guides and other resources, the choice for me is actually
* complex and not scalable (this, because I don't know how to implement it and I can't predict what pitfalls I might face if I try)
* simple and scalable (what everyone actually does)
and that makes the engineer's choice to follow faang look a lot more reasonable.
Also: you may think that you may one day want to be hired by a FAANG.
The nice thing about "boring" tech like Postgres is that it has great documentation. So just peruse https://www.postgresql.org/docs/current/sql-notify.html . No need for google-fu.
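For anyone who wants a starting point beyond the docs page, here is a minimal sketch of the listening side in Python. It assumes the psycopg2 driver rather than SQLAlchemy, and the names `drain_notifications` and `listen_forever`, the DSN, and the channel are made up for illustration, so treat it as a sketch and not a recommended implementation.

```python
import select


def drain_notifications(conn):
    """Return (channel, payload) pairs for notifications received so far."""
    conn.poll()  # psycopg2: process whatever the server has sent
    events = []
    while conn.notifies:
        note = conn.notifies.pop(0)
        events.append((note.channel, note.payload))
    return events


def listen_forever(dsn, channel, handle, timeout=5.0):
    """Block on a channel and call handle(channel, payload) for each event."""
    # Imported here so drain_notifications has no hard dependency on psycopg2.
    import psycopg2
    import psycopg2.extensions
    from psycopg2 import sql

    conn = psycopg2.connect(dsn)
    # LISTEN only takes effect once committed; autocommit keeps this simple.
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    with conn.cursor() as cur:
        # Quote the channel name as an identifier instead of pasting it in.
        cur.execute(sql.SQL("LISTEN {}").format(sql.Identifier(channel)))
    while True:
        # Wait until the connection's socket is readable or the timeout hits.
        if select.select([conn], [], [], timeout) == ([], [], []):
            continue
        for ch, payload in drain_notifications(conn):
            handle(ch, payload)
```

The sending side is plain SQL, for example `NOTIFY jobs, 'some payload'` or `SELECT pg_notify('jobs', 'some payload')`. Notifications are not stored for listeners that are disconnected, so a real queue would still keep its jobs in a table and use NOTIFY only as a wake-up signal.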
If you want a job there, very relevant.
> I tend to reject FAANG recruiters because Leetcode
I understand the pain of leetcode interviews. They’re terrible. But optimizing your career based on the interview process seems… backwards?
FAANG companies (for example) are very relevant if you want to make a lot of money and live in Silicon Valley without being a successful founder/VC. Apple farmers… not so much. If you live in Tokyo, then FAANG companies might be less relevant.
Either way, doesn’t seem like the interview is where you should draw the line.
AWS white papers and engineering blogs tend to give me everything I need in one place, and I don't think there are any for apps built with NOTIFY.
This is what kills you if you're a small startup. Of course it gives you a lot too. But if you're belly up then it doesn't matter.
Of course go for whatever solution gives you the most benefits while not distracting you too much from your main goal.
I've seen a startup where devs spent around 80% of the time fighting their tools and infrastructure. They had a 3 month runway and today there's a massive hole at the end of that runway. I still shudder from just the thought of it.
For instance, if you use postgres with a low load, it is almost trivial to migrate schemas, add new constraints, do analytics etc.
If you use SQS, Cassandra, whatever, then you now get scalability/availability but it becomes much more time-consuming to change things if you figure out that your original design doesn't work. Say the business comes and says "please add constraint X. All users of type foo must never have combined value bar at the same time."
It is possible to implement that without postgres, but it is not easy or simple, especially if you need to make changes.
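In a relational database, a rule like that is roughly one line of schema. A rough sketch, using Python's built-in sqlite3 as a stand-in so it runs anywhere (the same `CHECK` clause works in Postgres). The table, the column names, and the reading of the rule as "type foo may not hold value bar" are assumptions for illustration.

```python
import sqlite3


def make_users_db():
    """Create an in-memory table that enforces the business rule itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE users (
            id        INTEGER PRIMARY KEY,
            user_type TEXT NOT NULL,
            value     TEXT NOT NULL,
            -- Hypothetical rule: users of type 'foo' may never hold 'bar'.
            CHECK (NOT (user_type = 'foo' AND value = 'bar'))
        )
        """
    )
    conn.commit()
    return conn


def try_insert(conn, user_type, value):
    """Return True if the row was accepted, False if the constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (user_type, value) VALUES (?, ?)",
                (user_type, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this, `try_insert(conn, "foo", "bar")` is rejected by the database no matter which code path attempts it, and changing the rule later is a schema migration rather than a redesign.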
Therefore, my take is that you either use postgres to stay flexible or you use both postgres and something else on top of it when you know that you won't have to change things. Of course this means additional infrastructure/maintenance overhead.
In the end it's always a trade-off, you just need to know when to trade which thing off against what.
There's a (sort of) objective trade-off to be made, but another dimension is how familiar you are with the solution and/or how quick can you implement it using documentation and examples.
If you happen to know exactly how to create a horizontally scalable microservice based hairball with nodejs, then maybe you are quicker with that than with some traditional django monolith using a nicely normalized sql database (or whatever).
In a startup, you are almost always squeezed for time, so making the objectively right tradeoff for your context is usually secondary to the simple question of 'when can you ship?' If the scalable-yet-inflexible is what stack overflow abundantly recommends and documents, maybe this is quicker to get done now, whatever the consequences are in the longer run.
This is a valid comment. I’ve chosen Postgres in the past for the features, not the performance. For example guaranteed at most once delivery (via row locks) and filtering of jobs based on attributes (it’s a database after all).
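To make the row-lock part concrete, here is a sketch of how a worker might claim a job, assuming a psycopg2-style connection and a hypothetical `jobs` table with `id`, `kind`, `status` and `payload` columns. It is one common pattern, not necessarily what the parent did.

```python
# Hypothetical schema: jobs(id, kind, status, payload).
CLAIM_SQL = """
UPDATE jobs
SET status = 'running'
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'queued' AND kind = %s  -- filter on any attribute you like
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED                 -- skip rows other workers hold
)
RETURNING id, payload
"""


def claim_job(conn, kind):
    """Claim one queued job of the given kind, or return None if there is none.

    The job is marked 'running' and committed before any work starts, so a
    crashed worker never causes a re-run: at most once, not at least once.
    """
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL, (kind,))
        row = cur.fetchone()
    conn.commit()
    return row
```

`FOR UPDATE SKIP LOCKED` is what lets several workers run this concurrently without blocking each other or picking the same row.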
With your mentioned list, three of them are Python, so that significantly reduces the breadth.
It's even worse than you say though. As someone who has used neither Postgres nor Redis for queueing, how am I supposed to know what is the "simple" solution here and if it really solves my problem?
Almost everyone uses solution X. A few people are saying "no, just use solution Y, it's obviously enough and far simpler". Even if it is far simpler, how am I supposed to know whether there are some hidden gotchas here?
Much safer to bet on technology that is proven to work, given that large numbers of people are using it in production for this purpose.
If someone says "I choose X over Y because I used X before (or because X is better documented)" then fair enough - but I rarely hear that as an argument when choosing "FAANG-technology".
Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.
I don't think it's ideal for everyone to be implementing bespoke, Postgres-backed (or any other queue for that matter) background job workers in their applications. There's a lot of nuance and implementation details to get wrong with background jobs, and for that reason I think background work should generally be done by more comprehensive, dedicated libraries or frameworks.
If every Rails application didn't have Sidekiq/Active Jobs and instead had bespoke background worker implementations, Rails applications would likely have a much less rosy reputation on account of their unreliability.
> but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source
> [...] and that makes the engineer's choice to follow faang look a lot more reasonable.
I also agree with the linked article's overall point, but I think the specific "job queue" example from the article is actually a bad example because:
- "rolling your own" job queue is not rocket science but is nontrivial and easy to get wrong w.r.t. locking etc.
- the argument against taking additional dependencies is that now you have one more tool to master, understand, and manage. but my experience is that job queues like Sidekiq are not a significant overhead in terms of developer burden.
And that’s terrible advice, of course. You very likely will have time to scale things up (customer count almost never increases dramatically from one day to the next), and even if you don’t you’ll most likely never deliver a useable product if all components need to be “scalable” from the beginning.
I love the article's point, and I tend to feel that the "chasing the cargo cult of 'scale'" is maybe the biggest problem I see in development teams today. It is certainly the biggest problem that I rarely hear anybody talking about.
> Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.
I think the article might benefit from clarification on this point. Reading the HN comments, I see that I'm not the only person who came away with a misunderstanding there.
Again, I 100% love the overall point.
> If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it.
I'm not sure about this either, though from reading typical developer blogs and listening to the hivemind, you do get the feeling that you must be scalable. Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).
So yeah, I find a lot of the more complicated solutions to be simple, but mostly because they're well supported and not by just me.
But my response would then be that this is a stupid example in the context of this whole submission, because the submission talks about postgres, and trying to get postgres to scale "infinitely", let alone fulfill other properties like extremely high uptime, is just... insane. No one in their right mind tries to do that with postgres. Doing queueing with it is one thing, but "infinitely scalable" is a totally different one.
Therefore I can only say: yeah, to set up "an infinitely scalable queue-based architecture" you should not use postgres and the author in the submission says the same thing.
> Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).
Right, they make the wrong trade-offs. That is exactly what I wanted to express with my response.
> There’s a good chance that you’re already using a relational database, and if that relational database is Postgres, you should consider it for queues before any other software
The point is, if you are already using postgres, then the question is not: should I use postgres for queueing and the rest or should I use postgres for the rest and a FAANG solution for queueing on top of it.
Now the thing is that the FAANG solutions are great in certain ways and allow you to scale a lot and have extremely high availability. But it comes at a cost, for example those solutions don't support transactions like postgres does. So if you need those (and often you don't know in advance how the business of a startup develops) then now you have to build some technical solution on top of the FAANG solution which is much much slower and more complicated compared to doing it in postgres.
Even if you say that it's more difficult to set up and understand the queueing in postgres (and I agree), I would argue that in the end it is still faster because you don't need to set up and maintain all the infrastructure (yeah, even if it runs in the cloud) unless this is a prototype and you don't care about security, documentation and all of that and throw it away in the end anyway.
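To illustrate what transactions buy you when the queue lives in the same database as the rest of your data, here is a rough sketch. It uses Python's built-in sqlite3 purely so that it runs without a server, and the `orders`/`jobs` tables and the `place_order` helper are hypothetical. The same idea applies to a Postgres transaction.

```python
import sqlite3


def make_orders_db():
    """Create in-memory business and queue tables side by side."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL, kind TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def place_order(conn, item, fail_before_enqueue=False):
    """Insert an order and enqueue its follow-up job in one transaction."""
    with conn:  # commit if the block succeeds, roll back if it raises
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        if fail_before_enqueue:
            # Stand-in for a crash between the two writes.
            raise RuntimeError("simulated failure before enqueue")
        conn.execute(
            "INSERT INTO jobs (order_id, kind) VALUES (?, 'send_receipt')",
            (order_id,),
        )
    return order_id
```

If the failure happens, neither the order nor the job exists afterwards. With the order in a database and the job in a separate queue service, you would have to handle the "order saved but job never enqueued" case (or the reverse) yourself.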
Your argument is that going with FAANG level designs saves time?
And the crux of your argument is that you're able to find a guide online?
I strongly suspect you don't have a healthy respect for complexity.
Look up the features of Core if interested. No ORM needed, as it says in the docs.
For starters, "the most scalable component is always the most difficult to integrate and use" isn't true, and neither is "whatever your team knows or doesn't know, the challenges tied to integrating then exploiting a given component are always the same". There are many parameters. In some contexts taking into account the team's subjective preferences is crucial.
There is no universal rule, à la "always go for the most scalable, neglecting any other consideration" or "the minimal immediate effort is always the best option".
It's not a small thing and it's not something you should be dismissing out of hand.