
I often see the "engineers copy FAANG infrastructure because they want to be cool, even though their needs are completely different" take as a kind of attack on engineers.

But I think a lot of it is also about knowledge and documentation. If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it. Yes maintenance is higher, but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source.

If I want to use NOTIFY in postgres? I googled "SQLALchemy notify listen postgres" and I find a few unanswered stackoverflow questions and a github gist that has some code but no context.

I would honestly love to use this approach for a side project, but I don't have 2-3 days to figure it out on my own. The direct choice for me might seem to be

* simple, but not scalable (ie just use postgres)

* complex, but scalable (ie redis, sqs, whatever)

and then it's a tradeoff, and the argument goes that I am blinded by cool tech and FAANG and I'm choosing complex but scalable, even though I don't need scalable.

But taking into account guides and other resources, the choice for me is actually

* complex and not scalable (this, because I don't know how to implement it and I can't predict what pitfalls I might face if I try)

* simple and scalable (what everyone actually does)

and that makes the engineer's choice to follow faang look a lot more reasonable.

replies(8): >>natmak+V1 >>zozbot+N5 >>rantin+ea >>valent+Ve >>edanm+sl >>acaloi+zJ >>JohnBo+jX >>PH95Vu+6y1

>>ritzac+(OP)
Another point is: you don't need scalable now, but you may need it later (or even hope to), and you know that when you will need it you probably won't have time to invest into migrating this component.

Also: you may think that you may one day want to be hired by a FAANG.

replies(5): >>mkl95+u5 >>tmpX7d+1a >>runeks+001 >>evantb+v61 >>_jal+Na1

>>natmak+V1
How relevant is it to be hired by a FAANG? I have some experience with "web scale" systems, but I tend to reject FAANG recruiters because Leetcode makes me want to become an apple farmer (no pun).

replies(1): >>vineya+u7

>>ritzac+(OP)
> If I want to use NOTIFY in postgres?

The nice thing about "boring" tech like Postgres is that it has great documentation. So just peruse https://www.postgresql.org/docs/current/sql-notify.html . No need for google-fu.
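For what it's worth, here is a minimal sketch of the LISTEN/NOTIFY round trip using plain psycopg2 rather than SQLAlchemy. The function names and the JSON message format are assumptions for illustration, and `conn` is assumed to be a connection the caller created with `psycopg2.connect(...)`.

```python
import json
import select

# In the default configuration a NOTIFY payload must be shorter than 8000 bytes.
MAX_PAYLOAD_BYTES = 7999


def send_notification(cur, channel, message):
    """Publish a JSON message; listeners see it once the transaction commits."""
    payload = json.dumps(message)
    if len(payload.encode("utf-8")) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload too large for NOTIFY; send an id instead")
    # pg_notify() is the function form of NOTIFY and accepts bound parameters.
    cur.execute("SELECT pg_notify(%s, %s)", (channel, payload))


def drain_notifications(conn):
    """Return (channel, decoded message) pairs received so far."""
    conn.poll()  # psycopg2: process whatever the server has sent
    received = []
    while conn.notifies:
        note = conn.notifies.pop(0)
        received.append((note.channel, json.loads(note.payload)))
    return received


def listen(conn, channel, handle, timeout=5.0):
    """Block forever, calling handle(channel, message) per notification."""
    if not channel.isidentifier():
        raise ValueError("channel must be a simple identifier")
    conn.autocommit = True  # keep LISTEN out of a long-lived transaction
    with conn.cursor() as cur:
        # The channel is an identifier, so it cannot be a bound parameter.
        cur.execute(f'LISTEN "{channel}"')
    while True:
        # Sleep until the connection's socket is readable or the timeout passes.
        if select.select([conn], [], [], timeout) == ([], [], []):
            continue
        for chan, message in drain_notifications(conn):
            handle(chan, message)
```

A worker would call `listen(conn, "jobs", handler)` on its own dedicated connection, while producers call `send_notification` inside whatever transaction creates the work.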

replies(1): >>ritzac+h9

>>mkl95+u5
> How relevant is it to be hired by a FAANG?

If you want a job there, very relevant.

> I tend to reject FAANG recruiters because Leetcode

I understand the pain of leetcode interviews. They’re terrible. But optimizing your career based on the interview process seems… backwards?

FAANG companies (for example) are very relevant if you want to make a lot of money and live in Silicon Valley without being a successful founder/VC. Apple farmers… not so much. If you live in Tokyo, then FAANG companies might be less relevant.

Either way, doesn’t seem like the interview is where you should draw the line.

replies(1): >>mkl95+1b

>>zozbot+N5
Python, Flask, SQLAlchemy, and Postgres all have great documentation individually, but if I am building an application at the intersection, a guide on exactly how to join them all up is often much faster than using each individually and trying to figure out the interactions in four places.

AWS white papers and engineering blogs tend to give me everything I need in one place, and I don't think there are any for apps built with NOTIFY.

replies(3): >>random+Uk >>sgarla+ol >>sppras+uk2

>>natmak+V1
Yeah. Just keep it to side-projects only. Anyone practicing resume-driven development on my team will be (and I’m exaggerating here) shown the door.

replies(1): >>natmak+yM

>>ritzac+(OP)
> Yes maintenance is higher

This is what kills you if you're a small startup. Of course it gives you a lot too. But if you're belly up then it doesn't matter.

Of course go for whatever solution gives you the most benefits while not distracting you too much from your main goal.

I've seen a startup where devs spent around 80% of the time fighting their tools and infrastructure. They had a 3 month runway and today there's a massive hole at the end of that runway. I still shudder from just the thought of it.

replies(1): >>PH95Vu+X75

>>vineya+u7
I guess my career is pretty close to optimal. I get to work on interesting problems from anywhere I want and save a ton of money. If you are an EU candidate, FAANG companies want you to relocate to some city in the UK or Ireland, which would obliterate my savings rate, and are worse places to live than most mainland EU areas. I understand not everyone is as fortunate as me, which increases their motivation to grind Leetcode and the like.

>>ritzac+(OP)
Scalability comes at a price. Unless you need it, it makes you less flexible. And that is exactly what you don't want to be as a startup.

For instance, if you use postgres with a low load, it is almost trivial to migrate schemas, add new constraints, do analytics etc.

If you use SQS, Cassandra, whatever, then you now get scalability/availability but it becomes much more time-consuming to change things if you figure out that your original design doesn't work. Say the business comes and says "please add constraint X. All users of type foo must never be combined with value bar at the same time."

It is possible to implement that without postgres, but it is not easy or simple, especially if you need to make changes.
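For comparison, in a relational database a rule of that shape can be a single CHECK constraint. The sketch below is one possible reading of the rule, with a hypothetical `users` table and column names. It uses SQLite from the Python standard library only so that it runs without a server; the constraint expression itself is plain SQL.

```python
import sqlite3

# Hypothetical schema. On an existing Postgres table the same rule could be
# added with:
#   ALTER TABLE users ADD CONSTRAINT no_foo_with_bar
#       CHECK (NOT (user_type = 'foo' AND val = 'bar'));
SCHEMA = """
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,
    user_type TEXT NOT NULL,
    val       TEXT NOT NULL,
    CONSTRAINT no_foo_with_bar CHECK (NOT (user_type = 'foo' AND val = 'bar'))
)
"""


def make_db():
    """Create an in-memory database with the constrained table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_user(conn, user_type, val):
    """Insert a user; return False if the database rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (user_type, val) VALUES (?, ?)",
                (user_type, val),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The point is that the database enforces the rule on every write path, so no application code has to remember to check it.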

Therefore, my take is that you either use postgres to stay flexible or you use both postgres and something else on top of it when you know that you won't have to change things. Of course this means additional infrastructure/maintenance overhead.

In the end it's always a trade-off, you just need to know when to trade which thing off against what.

replies(2): >>Lutger+ug >>gazpac+Bk

>>valent+Ve
This is all true, important and often misunderstood, but beside the point of the comment you are replying to.

There's a (sort of) objective trade-off to be made, but another dimension is how familiar you are with the solution and/or how quickly you can implement it using documentation and examples.

If you happen to know exactly how to create a horizontally scalable microservice based hairball with nodejs, then maybe you are quicker with that than with some traditional django monolith using a nicely normalized sql database (or whatever).

In a startup, you are almost always squeezed for time, so making the objectively right tradeoff for your context is usually secondary to the simple question of 'when can you ship?' If the scalable-yet-inflexible option is what stack overflow abundantly recommends and documents, maybe it is quicker to get done now, whatever the consequences are in the longer run.

replies(1): >>valent+Gm

>>valent+Ve
> Scalability comes at a price. Unless you need it, it makes you less flexible. And that is exactly what you don't want to be as a startup.

This is a valid comment. I’ve chosen Postgres in the past for the features, not the performance. For example guaranteed at most once delivery (via row locks) and filtering of jobs based on attributes (it’s a database after all).
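A minimal sketch of what that can look like, assuming a hypothetical `jobs` table with `id`, `kind`, `status`, `started_at` and `payload` columns and a psycopg2-style connection. `FOR UPDATE SKIP LOCKED` (Postgres 9.5 and later) lets concurrent workers skip rows another worker has already locked, and the `kind` filter is ordinary SQL.

```python
# Claim the oldest queued job of one kind. The inner SELECT takes a row lock
# and skips rows that other workers have locked, so two workers never claim
# the same job.
CLAIM_SQL = """
UPDATE jobs
   SET status = 'running', started_at = now()
 WHERE id = (
         SELECT id
           FROM jobs
          WHERE status = 'queued'
            AND kind = %s
          ORDER BY id
          LIMIT 1
            FOR UPDATE SKIP LOCKED
       )
RETURNING id, payload
"""


def claim_job(conn, kind):
    """Return (id, payload) for one claimed job, or None if none is queued."""
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL, (kind,))
        row = cur.fetchone()
    # Committing before the work starts is what gives at-most-once behaviour:
    # if the worker crashes afterwards, the job stays 'running' and is not
    # handed out again.
    conn.commit()
    return row
```

If you would rather retry after a crash, the alternative is to keep the transaction open while the job runs, which trades at-most-once for at-least-once.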

>>ritzac+h9
If SQLAlchemy’s documentation doesn’t explain its use with LISTEN/NOTIFY, perhaps it’s the wrong tool for the job? You are presumably not going to use it with Redis or SQS queues, so why are you so hung up on it here?

>>ritzac+h9
IMHO - and this is probably why I’ll never launch a product - you should understand each piece of your infra. Not necessarily to the metal on each, but I don’t think it’s unreasonable to be able to explain why each piece is necessary, what it’s doing, and how to troubleshoot it when it breaks.

With your mentioned list, three of them are Python, so that significantly reduces the breadth.

>>ritzac+(OP)
That's a great point that people often misunderstand.

It's even worse than you say though. As someone who has used neither Postgres nor Redis for queueing, how am I supposed to know what is the "simple" solution here and if it really solves my problem?

Almost everyone uses solution X. A few people are saying "no, just use solution Y, it's obviously enough and far simpler". Even if it is far simpler, how am I supposed to know whether there are some hidden gotchas here?

Much safer to bet on technology that is proven to work, given that large amounts of people are using it in production for this purpose.

>>Lutger+ug
Then maybe I just don't understand your post. To me it sounds like you say "FAANG-technology" is chosen because of documentation. But I don't think that the documentation of e.g. SQS is better than that of postgres (if you can even compare the two).

If someone says "I choose X over Y because I used X before (or because X is better documented)" then fair enough - but I rarely hear that as an argument when choosing "FAANG-technology".

replies(2): >>Lutger+2f1 >>cereal+1q1

>>ritzac+(OP)
> If I want to use NOTIFY in postgres? I googled "SQLALchemy notify listen postgres" and I find a few unanswered stackoverflow questions and a github gist that has some code but no context.

Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.

I don't think it's ideal for everyone to be implementing bespoke, Postgres-backed (or any other queue for that matter) background job workers in their applications. There's a lot of nuance and implementation details to get wrong with background jobs, and for that reason I think background work should generally be done by more comprehensive, dedicated libraries or frameworks.

If every Rails application didn't have Sidekiq/Active Jobs and instead had bespoke background worker implementations, Rails applications would likely have a much less rosy reputation on account of their unreliability.

replies(1): >>JohnBo+z11

>>tmpX7d+1a
Decisions are rarely made upon a single criterion, and such a criterion isn't usually formulated explicitly.

>>ritzac+(OP)
I agree with you:

    but I can get set up with redis, SQS, any of the 
    'scalable' solutions within a few hours of copy-pasting 
    commands and code and configuration from a reputable 
    source

    [...] and that makes the engineer's choice to follow faang 
    look a lot more reasonable.

I also agree with the linked article's overall point, but I think the specific "job queue" example from the article is actually a bad example because:

- "rolling your own" job queue is not rocket science but is nontrivial and easy to get wrong w.r.t. locking etc.

- the argument against taking additional dependencies is that now you have one more tool to master, understand, and manage. but my experience is that job queues like Sidekiq are not a significant overhead in terms of developer burden.

>>natmak+V1
If your first point holds, then all app components should be “scalable” from the beginning, because you may not have time to make it so later.

And that’s terrible advice, of course. You very likely will have time to scale things up (customer count almost never increases dramatically from one day to the next), and even if you don’t you’ll most likely never deliver a useable product if all components need to be “scalable” from the beginning.

replies(1): >>natmak+pD3

>>acaloi+zJ
Thank you for writing this.

I love the article's point, and I tend to feel that the "chasing the cargo cult of 'scale'" is maybe the biggest problem I see in development teams today. It is certainly the biggest problem that I rarely hear anybody talking about.

    Author here. I would say that my post is less 
    targeted at someone like you (application developer, 
    presumably) and more targeted at library developers.

I think the article might benefit from clarification on this point.

Reading the HN comments, I see that I'm not the only person who came away with a misunderstanding there.

Again, I 100% love the overall point.

replies(1): >>acaloi+6a2

>>natmak+V1
Hard disagree. Some things are difficult to change later on, others not so much, and you can't do everything for v1. The product has to launch at some point. Your choice of queue is one of the things you'll be able to change. Don't complicate things unless you've run the numbers and know you'll need to. A lot of very large companies do just fine with using relational databases as queues.

>>natmak+V1
Building things you don't need in hopes that you'll need them because things you aren't spending time on will grow to demand them is like hiring an investment manager when you're in debt.

>>valent+Gm
That was exactly what ritzaco said (not me):

> If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it.

I'm not sure about this either, though from reading typical developer blogs and listening to the hivemind, you do get the feeling that you must be scalable. Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).

replies(1): >>valent+ss1

>>valent+Gm
I often find that the tooling I have at work helps speed up the development of more complicated solutions. So saying that FAANG solutions are easy to use and quick to build with is easy when you have that FAANG support. Even non-FAANG large enterprises allow for that, but at a startup it's easy to forget how much the environment (including tooling) speeds up all of that work.

So yeah, I find a lot of the more complicated solutions to be simple, but mostly because it's well supported and not by just me.

replies(1): >>valent+uu1

>>Lutger+2f1
Okay, if that is the context then I understand.

But my response would then be that this is a stupid example in the context of this whole submission, because the submission talks about postgres, and trying to get postgres to scale "infinitely", let alone fulfill other properties like extremely high uptime, is just... insane. No one in their right mind tries to do that with postgres. It is one thing to do queueing with it but "infinitely scalable" is a totally different one.

Therefore I can only say: yeah, to set up "an infinitely scalable queue-based architecture" you should not use postgres and the author in the submission says the same thing.

> Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).

Right, they make the wrong trade-offs. That is exactly what I wanted to express with my response.

>>cereal+1q1
But that is not what we are discussing here. From the submission:

> There’s a good chance that you’re already using a relational database, and if that relational database is Postgres, you should consider it for queues before any other software

The point is, if you are already using postgres, then the question is not: should I use postgres for queueing and the rest or should I use postgres for the rest and a FAANG solution for queueing on top of it.

Now the thing is that the FAANG solutions are great in certain ways and allow you to scale a lot and have extremely high availability. But that comes at a cost: for example, those solutions don't support transactions like postgres does. So if you need those (and often you don't know in advance how the business of a startup develops), you now have to build some technical solution on top of the FAANG solution, which is much, much slower and more complicated compared to doing it in postgres.
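To make the transaction point concrete, here is a minimal sketch in which the business write and the queued job commit or roll back together. The `users` and `jobs` tables and the `sign_up` function are hypothetical, and SQLite from the Python standard library stands in for postgres only so the example runs without a server.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
CREATE TABLE jobs  (id INTEGER PRIMARY KEY, kind TEXT NOT NULL,
                    user_id INTEGER NOT NULL REFERENCES users(id));
"""


def make_db():
    """Create an in-memory database with both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def sign_up(conn, email, fail_before_commit=False):
    """Create a user and enqueue a welcome email in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO jobs (kind, user_id) VALUES ('welcome_email', ?)",
            (cur.lastrowid,),
        )
        if fail_before_commit:
            raise RuntimeError("simulated crash before commit")


def count(conn, table):
    """Count rows in one of the two known tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

With an external queue, the failure case has to be handled by hand: either the user exists without a job, or a job exists for a user that was never saved.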

Even if you say that it's more difficult to set up and understand the queueing in postgres (and I agree), I would argue that in the end it is still faster because you don't need to set up and maintain all the infrastructure (yeah, even if it runs in the cloud), unless this is a prototype where you don't care about security, documentation and all of that and throw it away in the end anyway.

>>ritzac+(OP)
> and that makes the engineer's choice to follow faang look a lot more reasonable.

Your argument is that going with FAANG level designs saves time?

And the crux of your argument is that you're able to find a guide online?

I strongly suspect you don't have a healthy respect for complexity.

>>JohnBo+z11
Yep, fair critique. Glad you enjoyed it.

>>ritzac+h9
SQLAlchemy is an extra abstraction blocking your path here. While you probably should still use an ORM for your regular relational queries, you are not gaining anything significant by trying to use SQLAlchemy for implementing a queue backend. You can write raw SQL with psycopg2 (which is already a dependency in your project thanks to SQLAlchemy), and wrap this raw queue management SQL in a nice little Python module which you can later reuse for other applications as well.
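A rough sketch of such a module, under some assumptions: the `PgQueue` class, the `jobs` table and the `jobs` channel name are all made up for illustration, and `conn` is a psycopg2 connection the application already has.

```python
import json


class PgQueue:
    """Thin wrapper that keeps the raw queue SQL in one place."""

    def __init__(self, conn, channel="jobs"):
        self.conn = conn        # psycopg2 connection owned by the caller
        self.channel = channel  # NOTIFY channel that workers LISTEN on

    def enqueue(self, kind, payload):
        """Insert a job, wake listening workers, and return the new job id."""
        with self.conn.cursor() as cur:
            cur.execute(
                "INSERT INTO jobs (kind, payload) VALUES (%s, %s) RETURNING id",
                (kind, json.dumps(payload)),
            )
            job_id = cur.fetchone()[0]
            # The notification is only delivered once the commit below succeeds.
            cur.execute("SELECT pg_notify(%s, %s)", (self.channel, str(job_id)))
        self.conn.commit()
        return job_id

    def complete(self, job_id):
        """Remove a finished job from the queue."""
        with self.conn.cursor() as cur:
            cur.execute("DELETE FROM jobs WHERE id = %s", (job_id,))
        self.conn.commit()
```

Usage would be along the lines of `PgQueue(conn).enqueue("email", {"to": "a@example.com"})`. Nothing here depends on the ORM, so the same module can sit next to SQLAlchemy models without touching them.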

replies(1): >>mixmas+xK2

>>sppras+uk2
You can write raw SQL with SA, while keeping the other nice features it has.

replies(1): >>hmhmhm+kW2

>>mixmas+xK2
Without being rude, what are the nice features? I've worked with it a bit and constantly found myself wishing it was just SQL whenever I've bumped into it

replies(1): >>mixmas+7b3

>>hmhmhm+kW2
API/engine/connection/pooling abstraction, serialization, type checking, etc.

Look up the features of Core if interested. No ORM needed, as it says in the docs.

>>runeks+001
This is all a matter of balancing constraints. I wrote "you know that when you will need it you probably won't have time to invest into migrating this component.". I didn't write "always go for the more scalable".

For starters, "the most scalable component is always the most difficult to integrate and use" isn't true, and neither is "whatever your team knows or doesn't know, the challenges tied to integrating and then exploiting a given component are always the same". There are many parameters. In some contexts taking into account the team's subjective preferences is crucial.

There is no universal rule, à la "always go for the most scalable, neglecting any other consideration" or "the minimal immediate effort is always the best option".

>>rantin+ea
This line stood out to me too as very dismissive of something that can absolutely bring you to a standstill.

It's not a small thing and it's not something you should be dismissing out of hand.