I had a similar personal experience. In my previous job we used Postgres to implement a task queuing system, and it created a major bottleneck, resulting in tons of concurrency failures and bloat.
And most dangerously, the system failed catastrophically under load. As the load increased, most transactions ended in concurrency failures, so very little actual work got committed. That increased the number of outstanding tasks, which in turn drove the rate of concurrency failures even higher.
And this can happen suddenly: one moment the system behaves well, with tasks being processed at a good rate, and the next moment the queue blows up and nothing works.
I re-implemented this system using pessimistic locking, and it turned out to work much better. Even under very high load, the system could still make forward progress.
The downside was having to make sure that no deadlocks could happen.
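For anyone wondering, the pessimistic version boils down to something like this. This is a minimal sketch, not the code we actually ran; the tasks table and its columns are made up, and FOR UPDATE SKIP LOCKED is what lets workers claim rows without tripping over each other:

    -- minimal sketch of a worker claiming one task; schema is hypothetical
    BEGIN;

    SELECT id, payload
      FROM tasks
     WHERE status = 'pending'
     ORDER BY created_at
     LIMIT 1
       FOR UPDATE SKIP LOCKED;   -- skip rows other workers already hold

    -- ... do the actual work for the returned id ...

    UPDATE tasks
       SET status = 'done'
     WHERE id = $1;              -- the id returned by the SELECT above

    COMMIT;

Because each worker locks at most one row and holds it only for the duration of its own transaction, avoiding deadlocks mostly comes down to never acquiring locks on multiple rows in inconsistent orders.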
Yet, every month or two an article about doing exactly this is upvoted to near the top of HN. It can of course work, but it might be hard to replace years later once "barnacles" have grown on it. Every situation is different, of course.
Uber's workload at the time, ironically, was not enough to make a PostgreSQL server running on moderately decent hardware fall over if you actually read the manual.
Uber's engineering team will never be able to live this down.
https://www.reddit.com/r/programming/comments/4vms8x/why_we_...
https://www.postgresql.org/message-id/5797D5A1.5030009%40agl...
> The Uber guy is right that InnoDB handles this better as long as you don't touch the primary key (primary key updates in InnoDB are really bad).
> This is a common problem case we don't have an answer for yet.
It's still not how I remember it.
Quote from https://www.postgresql.org/message-id/flat/579795DF.10502%40...
I still prefer Postgres by a long way as a developer experience, for the sophistication of the SQL you can write and the smarts in the optimizer. And I'd still pick MySQL for an app that expects to grow to huge quantities of data, because of the path to Vitess.
https://www.postgresql.org/docs/current/runtime-config-resou... tells you what all the parameters do, but not why and how to change them.
"If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system." Why not set it to 25% of the memory in my system by default, then?
"Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files. If this value is specified without units, it is taken as kilobytes. The default value is four megabytes (4MB)." Yes, and? Should I set it higher? When?
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv... hasn't been updated for two years and explains only a handful of parameters.
"If you do a lot of complex sorts, and have a lot of memory, then increasing the work_mem parameter allows PostgreSQL to do larger in-memory sorts which, unsurprisingly, will be faster than disk-based equivalents." How much is a lot? Do I need to care if I'm running mostly OLTP queries?
"This is a setting where data warehouse systems, where users are submitting very large queries, can readily make use of many gigabytes of memory." Okay, so I need to set it higher if I'm running OLAP queries. But how high is too high?
https://wiki.postgresql.org/wiki/Performance_Optimization is just a collection of blog posts written by random (probably smart) people that may or may not be outdated.
So when someone complains their Postgres instance runs like ass and smug Postgres weenies tell them to git gud at tuning, they should be less smug: if your RDBMS requires extensive configuration to support nontrivial loads, either make that configuration the default or, if it differs significantly between load profiles, put a whole section in the manual covering day 1 and day 2 operations.
That the defaults don't handle top users is hardly an issue.