
1. skytre+(OP)[view] [source] 2021-06-12 10:41:18
This please. I feel like "How to Get Away with Just PostgreSQL" and the GP comment fall squarely under being so preoccupied with whether you could that you didn't stop to think if you should.

Whatever happened to use the proper data structures for the job? PostgreSQL and MySQL are, at the end of the day, b-trees with indices. Throw in relational properties and/or ACID too. Those aren't properties you need or want in a queue structure.

I know I don't have a solid argument against doing it; it's just experience (and dare I say, common sense) telling me not to. Not quite like parent, but I spent the first two years of my professional career on a team that had the brilliant idea to use DBs as queues. The big task I took part in during that stint was moving them off that v2 onto a v3 which used---wait for it---Redis. Everyone's quality of life improved with every migration, in proportion to the size of the v2 cluster we retired.

replies(4): >>zigzag+c4 >>lolind+Ll >>Seattl+PB >>e12e+tx1
2. zigzag+c4[view] [source] 2021-06-12 11:30:16
>>skytre+(OP)
> Whatever happened to use the proper data structures for the job?

This so much. People too often treat databases as magical black boxes that should handle anything. The database is most often the bottleneck, and choosing the proper storage engine with appropriate data structures can be 100x more efficient than just using the defaults. 1 server vs 100 can definitely make a noticeable difference in costs and system complexity.

While premature optimization is bad, choosing the right tool for the job is still somewhat important and will usually pay off in the long run.

replies(1): >>mirekr+m9
◧◩
3. mirekr+m9[view] [source] [discussion] 2021-06-12 12:35:42
>>zigzag+c4
I think your "most often" is more like 0.01%. I'd say the inverse is true: _most_ would be fine with a single SQLite host or something like rqlite.
replies(2): >>otoole+Hc >>zigzag+5e
◧◩◪
4. otoole+Hc[view] [source] [discussion] 2021-06-12 13:10:30
>>mirekr+m9
rqlite author here. Happy to answer any questions about it.

https://github.com/rqlite/rqlite

replies(1): >>mirekr+vh
◧◩◪
5. zigzag+5e[view] [source] [discussion] 2021-06-12 13:25:35
>>mirekr+m9
What would you then consider to be the most common bottleneck?

I agree that there are many cases with low workload where that would be plenty.

replies(2): >>mirekr+Dg >>solips+4B
◧◩◪◨
6. mirekr+Dg[view] [source] [discussion] 2021-06-12 13:50:31
>>zigzag+5e
The most common bottleneck is lack of competence.

Direct visible effects are wrong decisions entangled in spaghetti-like complexity.

It's hard to reach a technical bottleneck in well-designed systems. Computers are really fast nowadays. The bottlenecks will vary greatly depending on what kind of system it is. Out of the resources (CPU, memory, network, disk I/O), the weakest of them, likely the network, will be saturated first. But that's not a rule; it's easy to have a system that saturates, e.g., the CPU first.

replies(2): >>hughrr+wh >>zigzag+Rm
◧◩◪◨
7. mirekr+vh[view] [source] [discussion] 2021-06-12 13:57:32
>>otoole+Hc
Are you planning on adding websockets or something similar in the near future to support things like data change notifications [0]?

[0] https://www.sqlite.org/c3ref/update_hook.html

◧◩◪◨⬒
8. hughrr+wh[view] [source] [discussion] 2021-06-12 13:57:38
>>mirekr+Dg
This is the best post in this thread.

A lot of people don't see the effects of their decisions. They leave a company after 3-5 years and go and work somewhere else where they get to make the same mistake again. The bottleneck indeed is lack of competence.

As for technical bottlenecks, it's quite easy to hit a wall. Be it through layers of stupid or unexpected success. We have unexpectedly reached the limit of what is possible with x86-64 on a couple of occasions due to stupid decisions made over 10 years previously for which there is now no longer the budget or attention to fix.

9. lolind+Ll[view] [source] 2021-06-12 14:39:31
>>skytre+(OP)
What has me wanting to stick with postgres is that I work on a small team (two developers) and adding more technologies to our stack is extra overhead that's hard to justify. At our peak we're currently handling one request per second, and postgres for a queue is more than sufficient for that. Is there any good reason for us to add, learn, and maintain a technology neither of us yet knows? Or would we do just as well to abstract away the queue in the code so that we can switch to redis when we do run into scaling problems?
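For illustration, here is a minimal sketch of what "abstracting away the queue" might look like. Everything in it is an assumption made for the example: the `JobQueue` interface, the class names, and the `jobs` table are hypothetical, and `sqlite3` stands in for Postgres only so the snippet runs without a server. With several concurrent workers on Postgres, the dequeue step would typically rely on `SELECT ... FOR UPDATE SKIP LOCKED` rather than the plain select-then-delete shown here.

```python
import json
import sqlite3
from collections import deque
from typing import Optional, Protocol


class JobQueue(Protocol):
    """Hypothetical interface the application codes against."""

    def enqueue(self, payload: dict) -> None: ...

    def dequeue(self) -> Optional[dict]: ...


class SqlJobQueue:
    """SQL-table-backed queue; sqlite3 is a stand-in for Postgres here."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )

    def enqueue(self, payload: dict) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO jobs (payload) VALUES (?)", (json.dumps(payload),)
            )

    def dequeue(self) -> Optional[dict]:
        # Oldest job first. Safe for a single worker only.
        with self.conn:
            row = self.conn.execute(
                "SELECT id, payload FROM jobs ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            self.conn.execute("DELETE FROM jobs WHERE id = ?", (row[0],))
            return json.loads(row[1])


class InMemoryJobQueue:
    """Same shape as the SQL queue; a Redis-backed class would look alike."""

    def __init__(self) -> None:
        self._items: deque = deque()

    def enqueue(self, payload: dict) -> None:
        self._items.append(payload)

    def dequeue(self) -> Optional[dict]:
        return self._items.popleft() if self._items else None


def drain(queue: JobQueue) -> list:
    """Application code sees only the interface, never the backend."""
    jobs = []
    while (job := queue.dequeue()) is not None:
        jobs.append(job)
    return jobs
```

Whether a swap stays this cheap in practice depends on how much of the rest of the code reaches into the `jobs` table directly instead of going through the interface.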
replies(1): >>skytre+fs
◧◩◪◨⬒
10. zigzag+Rm[view] [source] [discussion] 2021-06-12 14:49:44
>>mirekr+Dg
Competence is expensive :) While I mostly agree, even well designed systems have (sometimes considerable) tradeoffs.

> It's hard to reach a technical bottleneck in well-designed systems. Computers are really fast nowadays.

I have been hearing how fast modern computers are for the better part of the past two decades, yet as a user I still have to deal daily with too much slow software and too many slow web services.

replies(1): >>mirekr+3z
◧◩
11. skytre+fs[view] [source] [discussion] 2021-06-12 15:43:10
>>lolind+Ll
> Is there any good reason for us to add, learn, and maintain a technology neither of us yet knows?

Absolutely, and that reason is: you are still a small team, with a small user base to boot. That's a fantastic opportunity to learn a new technology and build on it properly! Remember, everything is easier in software engineering if you assume you have no users[1], and your situation is as close as it gets to this ideal. Leverage it.

Plus, as me and others keep saying, Redis (and other proper queues) isn't a complex addition to your infra. This isn't Hadoop, or Kafka, which is a very special type of queue (one way to put it, at least).

> one request per second, and postgres for a queue is more than sufficient for that

Yes I agree but...

> Or would we do just as well to abstract away the queue in the code so that we can switch to redis when we do run into scaling problems?

What I read when I see such statements is this mythical software engineering ideal that with enough abstraction, a migration is just a matter of writing a new class that implements some interface and then changing a config. For a sufficiently complex app infra, that happens almost never because you could never keep the abstraction leaks to an acceptable level.

Another thing: abstraction does not solve all your problems if the underlying implementation is a poor fit to begin with. Let me paint you a plausible scenario:

Once you are large enough, you might find your PGQ acting weird, and you realize it's because someone on the team wrote code that accesses your queue table like it's an actual table of records, not a queue. So you think, okay, let's prevent that from happening. Maybe you add users and permissions to distinguish connections that need to access proper tables from those that need to access the queue. Maybe you start writing stored procs to check and enforce queue invariants periodically.

Well, guess what, all those problems would've been solved for free if you invested maybe one work day getting a Redis server running when you were a two-person op serving one request per second.

Lastly, scaling a relational DB is an entirely different beast from scaling a queue. Scaling anything is never painless, but you can reduce the suffering when it comes. Would you rather scale PG so it can keep acting as a queue, or scale a queue that's, you know, really a queue in the first place? Heck, the latter might even be solvable by throwing money at the problem (i.e., giving it more compute).

[1] Except for the part where you need to make money, of course.

replies(1): >>yongji+cW
◧◩◪◨⬒⬓
12. mirekr+3z[view] [source] [discussion] 2021-06-12 16:45:54
>>zigzag+Rm
Somebody once said "cheap things are expensive". This idea applies to developers as well. Cheap developers will drive a company down bumpy roads towards uninteresting plains. Good developers not only pay for themselves but bring in orders of magnitude more cash. The only thing I can find that touches on this is the "software craftsmanship manifesto".
replies(1): >>zigzag+ZL
◧◩◪◨
13. solips+4B[view] [source] [discussion] 2021-06-12 16:58:56
>>zigzag+5e
Money. The most common bottleneck is money and customers. Use whatever helps you get new customers faster.

Don't be scared of having to make changes in the future. Do the small amount of work it takes today to make sure your transition in the future is easy.

Transitioning from a SQL queue to Redis is only difficult if you have a bunch of SQL throughout your code. If you have that, you did it wrong.

14. Seattl+PB[view] [source] 2021-06-12 17:04:20
>>skytre+(OP)
Your comment sort of explains why you would use your DB as a queue. It is a big task to migrate to a new system. If you already have Postgres or MySQL integrated and deployed, using it as a queue may be the simplest option.
◧◩◪◨⬒⬓⬔
15. zigzag+ZL[view] [source] [discussion] 2021-06-12 18:22:50
>>mirekr+3z
True
◧◩◪
16. yongji+cW[view] [source] [discussion] 2021-06-12 19:58:47
>>skytre+fs
> Absolutely, and that reason is: you are still a small team, with a small user base to boot. That's a fantastic opportunity to learn a new technology and build on it properly! Remember, everything is easier in software engineering if you assume you have no users[1], and your situation is as close as it gets to this ideal. Leverage it.

I have to disagree. Of course code quality is important, but building things "properly" because "we may need it later" is a great way to kill a project with complexity. KISS, YAGNI. An early startup is, IMHO, not a good place to learn about new frameworks while getting paid - you're on borrowed time.

Make a back-of-the-envelope calculation about how much throughput you need. E.g., if you expect to have 10,000 users, and each may make one request per hour, you're dealing with about 3 qps (10,000 requests / 3,600 seconds). Anybody who wants to bring in a new dependency for this needs some talking to.
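The same estimate as a few lines of Python, in case you want to plug in your own numbers. The function name is made up for this sketch, and it assumes requests are spread evenly over the hour, which real traffic rarely is:

```python
def average_qps(users: int, requests_per_user_per_hour: float) -> float:
    """Average queries per second, assuming evenly spread load."""
    return users * requests_per_user_per_hour / 3600


qps = average_qps(users=10_000, requests_per_user_per_hour=1)
print(f"{qps:.2f} qps")  # prints "2.78 qps", i.e. roughly 3 qps
```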

(If you already need Redis anyway and it's a better fit than Postgresql, then sure, go ahead.)

replies(1): >>skytre+511
◧◩◪◨
17. skytre+511[view] [source] [discussion] 2021-06-12 20:54:09
>>yongji+cW
> but building things "properly" because "we may need it later" is a great way to kill a project with complexity

Emphasis added because I feel like I addressed this in the paragraph immediately after the one you quoted:

> Plus, as me and others keep saying, Redis (and other proper queues) isn't a complex addition to your infra

I'm speaking from experience and, as I already pointed out in another subthread, Postgres is far more complex than Redis. Consider the existence of "DB Admins/Specialists" and the lack of any counterpart for Redis and other queuing solutions.

Of course, if queues are not central to how your platform operates, you might be able to get away with Postgres. I still advise using Redis as a reasonable hedge against someone famous tweeting organically about your service because in this case, you don't want your DB to go down because some queue table had a surplus of transactions (or vice versa).

Not to mention, at an early stage, your tech decisions set precedents for the team. Maybe you have 10K users with a low qps, but soon you are sending marketing emails to them and your system has periodic bursts of queue activity for all 10K users at once. When discussing this marketing "feature", rarely does anyone think, "Hey, we can't do that with our Postgres queue"; rather, it's "Yeah, I saw functions in our codebase for queuing---this is doable". This is a small effort but a huge technical investment for later on.
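To put rough numbers on the burst scenario, here is a small sketch of how long a backlog of 10K jobs would take to drain at a few processing rates. The rates are purely hypothetical inputs, not measurements of Postgres or Redis, and the function name is invented for this example:

```python
def drain_seconds(backlog: int, jobs_per_second: float) -> float:
    """Seconds to work off a backlog at a steady processing rate."""
    if jobs_per_second <= 0:
        raise ValueError("jobs_per_second must be positive")
    return backlog / jobs_per_second


# Hypothetical rates: 3 jobs/s matches the ~3 qps average estimated upthread.
for rate in (3, 50, 500):
    minutes = drain_seconds(10_000, rate) / 60
    print(f"{rate:>3} jobs/s -> {minutes:.1f} min to drain 10,000 jobs")
```

The point is only that a system sized for the average rate can take close to an hour to clear a burst of this size, so the burst, not the average, is the number to plan around.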

replies(1): >>rkk3+cI1
18. e12e+tx1[view] [source] 2021-06-13 02:56:22
>>skytre+(OP)
> Whatever happened to use the proper data structures for the job? PostgreSQL and MySQL are, at the end of the day, b-trees with indices. Throw in relational properties and/or ACID too. Those aren't properties you need or want in a queue structure.

How is a relational storage engine with support for transactions, document/JSON, ACID, and hot standby "the wrong data model" for a queue?

◧◩◪◨⬒
19. rkk3+cI1[view] [source] [discussion] 2021-06-13 05:34:20
>>skytre+511
> I still advise using Redis as a reasonable hedge against someone famous tweeting

Early stage startups die because of lack of PMF. Diverting focus and resources away from finding PMF kills companies. Most companies should focus on the product, tech debt be damned.
