Yeah, table bloat and transaction ID wraparound are terrible, but easily avoidable if you follow a few simple guidelines. In my experience, the best way to avoid these issues is to set sensible vacuum settings and track long-running queries.
I do hate that some of the defaults in the Postgres configuration are too conservative for most workloads.
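To make that concrete, this is roughly the kind of thing I mean; the timeout value is illustrative, not a tuned recommendation:

  -- Spot transactions that have been open long enough to hold back vacuum
  SELECT pid, now() - xact_start AS xact_age, state, left(query, 60) AS query
  FROM pg_stat_activity
  WHERE xact_start IS NOT NULL
  ORDER BY xact_age DESC;

  -- Terminate sessions that sit idle inside a transaction for too long
  ALTER SYSTEM SET idle_in_transaction_session_timeout = '10min';
  SELECT pg_reload_conf();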
(pdf) https://www.redshiftresearchproject.org/white_papers/downloa...
(html) https://www.redshiftresearchproject.org/white_papers/downloa...
I've been told, very kindly, by a couple of people that it's the best explanation they've ever seen. I'd like to get more eyes on it, to pick up any mistakes, and it might be useful to readers in and of itself anyway, as MVCC on Redshift is, I believe, the same as MVCC was on Postgres before snapshot isolation.
> In the 2000s, the conventional wisdom selected MySQL because rising tech stars like Google and Facebook were using it. Then in the 2010s, it was MongoDB because non-durable writes made it “webscale”. In the last five years, PostgreSQL has become the Internet’s darling DBMS. And for good reasons!
Different DBs, different strengths, and it's not a zero-sum game as implied. MySQL was popular before Google was born - we used it heavily at eToys in the '90s for massive transaction volume, and replacing it with Oracle was one of the reasons for the catastrophic failure of eToys circa 2001. MongoDB gained traction not because it's an alternative to MySQL or PostgreSQL. And PostgreSQL's market share today is on a par with Mongo's, and both are dwarfed by MySQL, which IMO is the true darling of web DBs given its global popularity.
it is also black magic to tune them.
https://aws.amazon.com/about-aws/whats-new/2022/05/amazon-re...
For concurrency scalability, AWS now configures SNAPSHOT ISOLATION by default if you use Redshift Serverless but non-serverless still defaults to SERIALIZABLE ISOLATION.
https://aws.amazon.com/blogs/database/manage-long-running-re...
I think this is the best video on that topic: https://www.youtube.com/watch?v=b2F-DItXtZs
General remark for startups wanting attention on HN: it's not good to end an interesting article with a call-to-action that makes your article feel like an ad. Readers who read to the end experience that as a bait-and-switch and end up feeling betrayed.
What works much better is to disclose right up front what your startup is and how it's related to the article content. Once you've gotten that out of the way, the reader can then dive into the (hopefully) interesting content and end the article on a satisfying note.
Btw, I have a set of notes on how to write for HN that I'm working (slowly) on turning into an essay. If anyone wants a copy, email me at hn@ycombinator.com and I'll be happy to send it. It includes the above point and a bunch more.
On managed Postgres (e.g. GCP, AWS) you pay for the disk, but when you can't run a VACUUM FULL because it locks the table, you end up with a lot of allocated storage for nothing, and you can't shrink the disk size (at least on GCP). Storage is cheap, but it still feels like a waste.
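For anyone in the same boat: before biting the bullet, it's worth at least measuring the problem. This is only a rough proxy (the dead-tuple counters are estimates, and real bloat estimation needs something like the pgstattuple extension), but as a sketch:

  SELECT relname,
         pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
         n_live_tup,
         n_dead_tup,
         last_autovacuum
  FROM pg_stat_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 10;

As far as I know, pg_repack can also rewrite a table while only taking brief locks at the start and end, which is how people usually dodge VACUUM FULL on managed instances.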
Way back I was working on an in-house inventory app written in Visual Basic against SQL Server 2000, I think. That one just put locks on tables. It had the "charming" characteristic that if you weren't very, very careful with Enterprise Manager, loading a table in the GUI put a lock on it and just kept holding it until that window was closed.
Then the running app would eventually snag on that lock, maybe while holding some other lock that something else would snag on, and 5 minutes later I'd hear one of the operators screaming "Nothing is working! I can't take any orders!" from the room next to me.
Though any other optimistic concurrency control scheme might be better, PostgreSQL was in the right place at the right time when people started to leave MySQL.
I had a similar personal experience. In my previous job we used Postgres to implement a task queuing system, and it created a major bottleneck, resulting in tons of concurrency failures and bloat.
And most dangerously, the system failed catastrophically under load. As the load increased, most transactions ended up as concurrency failures, so very little actual work got committed. This increased the number of outstanding tasks, resulting in an even higher rate of concurrency failures.
And this can happen suddenly: one moment the system behaves well, with tasks being processed at a good rate, and the next moment the queue blows up and nothing works.
I re-implemented this system using pessimistic locking, and it turned out to work much better. Even under very high load, the system could still make forward progress.
The downside was having to make sure that no deadlocks can happen.
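For anyone curious, the usual Postgres idiom for this kind of queue is row-level locking with SKIP LOCKED; a minimal sketch, with table and column names invented for illustration:

  BEGIN;

  -- Claim one pending task; other workers skip rows we hold locked instead of blocking
  SELECT id, payload
  FROM tasks
  WHERE status = 'pending'
  ORDER BY id
  LIMIT 1
  FOR UPDATE SKIP LOCKED;

  -- ...do the work, then mark it done in the same transaction...
  UPDATE tasks SET status = 'done' WHERE id = /* id from above */ 42;

  COMMIT;

Locking at most one row per worker, always in the same order, also sidesteps most of the deadlock risk mentioned above.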
Oh man, a previous company I worked at had an issue with a hot table (frequent reads + writes) interfering with autovacuum. Many fires over a six month period arose from all of that. I was (luckily) only on an adjacent team, so I don't know the details, other than vacuums taking over 24 hours! I'm sure it could have been prevented, but it seemed horrible to debug
So how does one work around PostgreSQL’s quirks? Well, you can spend an enormous amount of time and effort tuning it yourself. Good luck with that.
On the one hand, this does make the model Postgres uses admirably simple: the WAL is all "REDO," and the heap is all you need to accomplish any kind of read. The expense is that data which would normally be copied off to a sequential UNDO log, and then vaporized when the transaction commits and all possible readers have exited, instead remains commingled with everything else in the main database heap, needing to be fished out again by VACUUM for purging and for figuring out how to reclaim numerical space for more transactions.
There may be other solutions to this, but it's one unusual quality Postgres has relative to other MVCC databases, many of which sport an UNDO log.
There are downsides to UNDO, however: if a read needs an old copy of the tuple, it needs to fish around in UNDO, all the indices and synchronization need to account for this, and if there's a rollback or crash recovery event (i.e. mass-rollback of all transactions open at the time), everything has to be shuffled back into the main database storage. Hence the memorable initial comment: "Postgres is optimized for rollback."
Last I checked (a few years ago, so things may have changed), the autovacuum heuristics hadn't evolved much since the turn of the millennium; they're probably about due for a refresh.
Do you mean this one?
> At OtterTune, we see this problem often in our customers’ databases. One PostgreSQL RDS instance had a long-running query caused by stale statistics after bulk insertions. This query blocked the autovacuum from updating the statistics, resulting in more long-running queries. OtterTune’s automated health checks identified the problem, but the administrator still had to kill the query manually and run ANALYZE after bulk insertions. The good news is that the long query’s execution time went from 52 minutes to just 34 seconds.
The problem, as the article states it, is that a "sensible" vacuum setting for one table is a terrible setting for another, depending on how large these tables are. On a 100-million-tuple table you'd be waiting 'til there were 20 million garbage tuples before taking action.
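The usual workaround is to override the default per table (autovacuum_vacuum_scale_factor is 0.2 out of the box) so big tables vacuum on an absolute row count instead; the numbers here are only an example:

  ALTER TABLE big_table SET (
    autovacuum_vacuum_scale_factor = 0.0,    -- ignore the percentage-of-table trigger
    autovacuum_vacuum_threshold    = 100000  -- vacuum after ~100k dead tuples instead
  );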
This also led to popularity of bigger reselling setups (I don't miss installing cpanel...) and services like Dreamhost.
MySQL in this way gained a virtuous cycle completely unrelated to Google. Hell, most people I know who dealt with the LAMP space for years never knew Google had anything to do with MySQL (most of the people who did know about it were... Lispers, because of who built the first version of Google Ads).
Even Mac OS X Server shipped with MySQL and PHP because of that, in 2001.
We have been working on automatic database optimization using AI/ML for a decade at Carnegie Mellon University [1][2]. This is not a gimmick. Furthermore, as you can see from the many comments here, the problem is not overhyped.
As a reader I’d have appreciated the original. And I’d appreciate a nice HN alternative.
...Along with replication and being joined at the hip to PHP. As to installation, there was a point in time in the early 2000s where you could sudo to root, type 'mysql', and be talking to a live MySQL on most Linux distros that I used. No wonder a lot of people defaulted to it.
I'm all about avoiding premature optimization, and it's fine to start with classic Postgres. But please don't cling to it - if you see MVP success and you actually have a reasonable chance of getting to >1 million users (e.g., a successful B2C product), please don't wait to refactor your datastore to a more scalable solution. You will pay dearly if you wait too long. Absolutist advice serves no one well here - it really does depend on what your goals are as a company.
A "darling" is something you want to use, not something that you are using. Many do not want to use MySQL due to Oracle control. Postgres is definitely the darling of the past few years.
I admit that's not precisely how I described it in the GP comment but it never crossed my mind that anyone would care. Commenter objections never fail to surprise!
Edit: I think I was right that it was in their interest as well as all of ours, because earlier the thread was dominated by complaints like this:
https://news.ycombinator.com/item?id=35718321
https://news.ycombinator.com/item?id=35718172
... and after the change, it has been filling up with much more interesting on-topic comments. From my perspective that's a win-win-win, but YMMV.
Yet every month or two an article about doing exactly this is upvoted to near the top of HN. It can of course work, but it might be hard to replace years later once "barnacles" have grown on it. Every situation is different, of course.
Honestly I think it only gained traction because many Node devs refused to learn SQL and the document model is familiar because it's closer to JSON data.
These days Mongo is good but that wasn't the case back 10+ years ago.
sudo apt-get install mysql-server mysql-client
sudo -i mysql
and be logged in as admin to the mysql database was indeed a huge reason for defaulting to it. EDIT: Of course, at that time there was no Ubuntu teaching everyone to sudo all the time, so drop all instances of sudo and add a su - at the start ;)
Replication was something you did when you got successful enough to need it, or were an MSP providing it at a premium to others.
Uber's workload at the time, ironically, was not enough to make a PostgreSQL server running on moderately decent hardware fall over, if you actually read the manual.
Uber's engineering team will never be able to live this down.
So what MVCC databases do instead is keep enough versions to cover the oldest running query. Now read-only queries don't need to hold any locks at all; they just read the newest version that's older than the transaction ID the query started at.
MVCC is great, but this article does identify some of the puzzling design choices of the Postgres implementation. The index problems are particularly bad, and seemingly avoidable.
Digg.com also had a really influential technical team - hearing about how they did things set a lot of baseline defaults for a lot of people.
Essentially, starting around 2000-2001, more and more people got into running websites for all kinds of reasons (forums, blogs, webshops, etc.), often hosted on low-end offerings.
Lots of the pain points have been mitigated in the last ten years. It is now as simple as a comparably complex DB can get (i.e. not simple, but you can't find a better product).
This doesn’t have anything to do with MVCC. I’m sure PostgreSQL could implement an index format that piggybacks on another index rather than pointing at the physical page directly, without overhauling MVCC.
10+ years of dev work for a few hundred tables worries me a lot. My last conversion was about 20 years of data across a few hundred tables and we did two-way data synchronization across DB products with about 1 month of work, with 2 devs. We kept the sync running for over a year in production because we didn't want to force users over to the new system in a big hurry. We only stopped because the license on the old DB product finally expired and nobody wanted to pay for it anymore.
They were absolute liars of the first water back in the day, absolutely. In the 3.x era there were claims that transactions were only for people who didn't know how to program! You'd struggle to find most of the absolute nonsense that was being pushed, because it's mostly gone down various memory holes, but it was absolutely breathtaking.
Disagree. It gained traction because it was an alternative to MySQL in the ways that mattered - fast, easy to administer, widely known, good enough. Yes, there are significant differences in the details of what they do - but in terms of someone looking for a backing datastore for their webapp, they're actually competing in a very similar space.
Snapshot isolation isn't as robust and straightforward as strict serializability, but it also isn't as performant as READ COMMITTED. It seems like the worst of both worlds.
> A better approach is to use an AI-powered service automatically determine the best way to optimize PostgreSQL. This is what OtterTune does. We’ll cover more about what we can do in our next article. Or you can sign-up for a free trial and try it yourself.
That was removed after the article was posted to HN, at dang's suggestion - he posted about it elsewhere in these comments.
But I can't find the Ottertune Github page
Is any part of Ottertune open source?
https://www.reddit.com/r/programming/comments/4vms8x/why_we_...
https://www.postgresql.org/message-id/5797D5A1.5030009%40agl...
1. Since MySQL keeps deltas to save storage costs, wouldn't reads and writes be slower, because now I have to build the full version from the delta?
2. On secondary indexes, they highlight that reads will be slower and also say:
> Now this may make secondary index reads slower since the DBMS has to resolve a logical identifier, but these DBMS have other advantages in their MVCC implementation to reduce overhead.
What are the other advantages they have to make reads faster?
Compared to MySQL, I remember reading that Postgres MVCC lets you alter a table without locking. Now I've found out that MySQL also does not require locks. So, how do they each do it?
Are there any similar posts which explain MySQL MVCC architecture?
I think Mongo became popular because it came out of ad tech and those guys knew how to be buzzword-compliant. JSON-esque documents are one thing, but Mongo is JavaScript to the core. All of a sudden your JS devs don't have to learn SQL; they can just shit out some queries in JavaScript. Of course that came with some pretty severe drawbacks.
Never mind the other aspect the pat advice doesn't mention - managing a massive single RDBMS is a goddamn nightmare. At a very large scale they are fragile, temperamental beasts. Backups, restores, and upgrades all become hard. Migrations become a dark art, often taking down the DB despite your best understanding. Errant queries stall the whole server; tiny subtleties in index semantics do the same. Yes, it's all solvable with a lot of skill, but it ain't a free lunch, that's for sure. And it tends to become a HUGE drag on innovation, as any change to the DB becomes risky.
To your other point yes, replicating data “like for like” into another RDBMS can be cheap. But in my experience this domain data extraction is often taken as an opportunity to move it onto a non RDBMS data store that gives you specific advantages that match that domain, so you don’t have scaling problems again. That takes significantly longer. But yes I am perhaps unfairly including all the domain separation and “datastore flavor change” work in those numbers
did google even use mysql? certainly if they did they never talked about it publicly in the early 02000s, and of course facebook didn't even exist then
lj, though, they used the fuck out of mysql
/. originally didn't use a database; i (an ordinary user) accidentally posted an article by trying to post a comment on an article that didn't exist yet; i guess they got appended to the same file. but when it did switch to a database (i don't know, about the time google was founded?) it was of course mysql
Maybe that's why I am used to logging in as root rather than as a user. I started in 1999 and have been surprised by how few users do that now.
  $ ssh user@server
  <rsa key>
  $ sudo -i
  <user pass>
  #
I think this kind of anticipation was part of Pinterest's early success, for example. They got ahead of their database scaling early and were able to focus on the product and UX.
We first used Mongo ~11 years ago with Java. For us the benefit was that we could dump unstructured data into it quickly, but still run queries / aggregations on it later.
As for MySQL and locks, the original MyISAM table format used locks, but InnoDB tables are MVCC like pgsql.
> The Uber guy is right that InnoDB handles this better as long as you don't touch the primary key (primary key updates in InnoDB are really bad).
> This is a common problem case we don't have an answer for yet.
It's still not how I remember it.
Quote from https://www.postgresql.org/message-id/flat/579795DF.10502%40...
I still prefer Postgres by a long way as a developer experience, for the sophistication of the SQL you can write and the smarts in the optimizer. And I'd still pick MySQL for an app which expects to grow to huge quantities of data, because of the path to Vitess.
https://www.postgresql.org/docs/current/runtime-config-resou... tells you what all the parameters do, but not why and how to change them.
"If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system." Why not set it to 25% of the memory in my system by default, then?
"Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files. If this value is specified without units, it is taken as kilobytes. The default value is four megabytes (4MB)." Yes, and? Should I set it higher? When?
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv... hasn't been updated for two years and explains only a handful of parameters.
"If you do a lot of complex sorts, and have a lot of memory, then increasing the work_mem parameter allows PostgreSQL to do larger in-memory sorts which, unsurprisingly, will be faster than disk-based equivalents." How much is a lot? Do I need to care if I'm running mostly OLTP queries?
"This is a setting where data warehouse systems, where users are submitting very large queries, can readily make use of many gigabytes of memory." Okay, so I need to set it higher if I'm running OLAP queries. But how high is too high?
https://wiki.postgresql.org/wiki/Performance_Optimization is just a collection of blog posts written by random (probably smart) people that may or may not be outdated.
So when someone complains their Postgres instance runs like ass and smug Postgres weenies tell them to git gud at tuning, they should be less smug, because if your RDBMS requires extensive configuration to support nontrivial loads, you either make this configuration the default one or, if it's significantly different for different load profiles, put a whole section in the manual that covers day 1 and day 2 operations.
1) too many people don't understand why they probably should use snapshot isolation (or stricter) and don't understand what guarantees they don't get when read committed is used
2) it's not the default, defaults matter, a lot
3) it makes transactions spuriously fallible when the DB can't make sure that committing two parallel write transactions won't break the consistency guarantees; a lot of frameworks don't have the right tools to handle this, people don't expect it, it can make interactions with other systems that are interleaved in the transaction harder, etc.
(as a side note, I assumed you meant REPEATABLE READ when you said snapshot isolation, as it's the least strict isolation level which uses snapshot isolation)
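To illustrate point 3, a minimal sketch of the extra failure mode (table and values invented):

  BEGIN ISOLATION LEVEL REPEATABLE READ;
  -- every read in here sees one consistent snapshot
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  COMMIT;
  -- If another transaction committed an update to that same row in the meantime, the
  -- UPDATE above fails with SQLSTATE 40001 ("could not serialize access due to
  -- concurrent update") and the application has to retry the whole transaction.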
I recall doing an evaluation of open source databases in 2001. MySQL didn't even have row-level locking, let alone any concept of transactions. I summarised it as "easy to use; but only for data you don't care about".
* Not that Postgres (as it was then) was without warts in 2001. A huge one was its "object orientation": table inheritance. What it needed then, and would still be nice to have, is object orientation at data type (column) level, an extension of the SQL domain.
Edit: I should clarify that I recognize the need for one extra version, since read-committed txns shouldn't see it until it is committed. Other writes must wait for the commit until they can write, though - it seems like there's some optimistic-writing thing where we let a bunch of writes queue up for one record, knowing that we're going to have a problem when one of them commits and the others find out they should have waited before trying to write, because we didn't force them to acquire a write lock before writing.
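If it helps, here's roughly how that plays out in Postgres with two sessions touching the same row (table and values made up):

  -- Session A
  BEGIN;
  UPDATE t SET v = 1 WHERE id = 42;   -- takes the row lock and holds it until commit

  -- Session B, concurrently
  BEGIN;
  UPDATE t SET v = 2 WHERE id = 42;   -- blocks here on A's row lock

  -- Session A
  COMMIT;

  -- Session B now wakes up:
  --   under READ COMMITTED it re-reads the newly committed row and applies its update;
  --   under REPEATABLE READ it aborts with "could not serialize access due to concurrent update".
  COMMIT;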
I wish PostgreSQL were as simple as MySQL when it comes to replication and failover. It's always a pain when switching masters back and forth.
Would love to hear a from-the-trenches summary of that.
e.g. if it looks key-value-ish, or key + timestamp (e.g. a user transaction table), dynamodb is incredible. Scales forever, and you never have to think about operations. But it's not generally queryable like pg.
if it looks event-ish or log-ish, offload to a redshift/snowflake/bigtable. But append only & eventually consistent.
if you really need distributed global mutations, and are willing to pay with latency, spanner is great.
if you can cleanly partition by tenant or shard your data and there's little-to-no cross-shard querying, then vitess or some other RDBMS shard-automation layer can work.
There are a few "postgres but distributed" dbs maturing now, like cockroach - I haven't personally used them at a scale where I could tell you whether they actually work, though. AFAIU these systems still have tradeoffs around table layout and access patterns that you have to think about.
I remember people stating that transactions are useless, and maybe they are for some workloads, see the success of MongoDB years later.
The transactional engine InnoDB was added in version 3.23 [1] in 2001 [2].
[1] https://en.wikipedia.org/wiki/Comparison_of_MySQL_database_e...
That the defaults don't handle top users is hardly an issue.
What is a delta undo log?
Later on, VACUUM has to plow through everything and check the oldest running transaction to see whether the tuple can be "frozen" (old enough to be seen by every transaction, and not yet deleted) or the space reclaimed as usable (deleted and visible to nothing). Index tuples likewise must be pruned at this time.
In systems with an UNDO log, the tuple is mutated in place and the contents of the old version placed into a sequential structure. In the case where the transaction commits, and no existing concurrent repeatable read level transactions exist, the old version in the sequential structure can be freed, rather than forcing the system to fish around doing garbage collection at some later time to obsolete the data. This could be considered "optimized for commit" instead of the memorable "optimized for rollback."
On the read side, however, you need special code to fish around in UNDO (since the copy in the heap is uncommitted data at least momentarily) and ROLLBACK needs to apply the UNDO material back to the heap. Postgres gets to avoid all that, at the cost of VACUUM.
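Tangentially, the freeze side of this is at least easy to watch; something along these lines shows how close each table is creeping toward the wraparound threshold:

  SELECT c.relname,
         age(c.relfrozenxid) AS xid_age,
         current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
  FROM pg_class c
  WHERE c.relkind = 'r'
  ORDER BY age(c.relfrozenxid) DESC
  LIMIT 10;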
[1] The exception is "HOT" (heap only tuple) chains, which if you squint look a tiny bit UNDO-y. https://www.cybertec-postgresql.com/en/hot-updates-in-postgr...
Of course if you're running reduced consistency things are easier. Wrong answers are often faster. But when people select a transactional database, it's usually because they at least need snapshot consistency, and often they require full serializability.
There are multiple years available for his first DB class but that’s the one I watched. I almost called it his ‘basic’ class but there’s literally only like one or two classes on SQL before he dives into all the various layers of the internals.
There are also a few of his advanced courses. And then you'll see guest lectures from industry on just about every one of the new DB platforms you can think of.
They’re all under “CMU Database Group” on Youtube.
Highly recommend.