"In Python, most of the bottleneck comes from having to fetch and deserialize Redis data."
This isn't a fair comparison. Of freaking course postgres would be faster if it's not reaching out to another service. ML algorithms get a lot of focus and hype. Data retrieval, not as much.
If they wanted it to be a fair comparison, they should have used FDWs to connect to the same redis and http server that the python benchmarks tested against.
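For what it's worth, the Redis leg of such a baseline isn't much setup. A minimal sketch, assuming the redis_fdw extension is installed (server address, key names, and credentials below are made up for illustration; the HTTP leg would need a separate wrapper, e.g. something built on Multicorn):

```sql
-- Hypothetical FDW baseline: expose the benchmark's Redis instance as a table.
CREATE EXTENSION redis_fdw;

CREATE SERVER redis_bench
    FOREIGN DATA WRAPPER redis_fdw
    OPTIONS (address '127.0.0.1', port '6379');

CREATE USER MAPPING FOR PUBLIC
    SERVER redis_bench
    OPTIONS (password 'secret');

-- One row per key/value pair in Redis database 0.
CREATE FOREIGN TABLE redis_features (key text, value text)
    SERVER redis_bench
    OPTIONS (database '0');

-- The "fetch" step from the quoted sentence becomes a plain query:
SELECT value FROM redis_features WHERE key = 'user:42:features';
```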
It's like if I told you to move to a place where you can walk 5 minutes to work, and you tell me it's not a fair comparison because right now you have to drive to the station and then get on a train, and you're interested in a comparison where you walk to the train instead. You don't need the train because you're already there!
You don't need the network hops exactly because the data is already there in the right way.
Don't you think it would be incredibly useful as a baseline if they included a third test with FDWs against redis and the http server?
Calling this Postgres vs Flask is misleading at best. It’s more like "1-tier architecture vs 2-tier architecture".
Remember, this is not plain file serving -- this is actually invoking the XGBoost library, which does complex mathematical operations. The user does not get data from disk; they get inference results.
Unless you know of any other solution which can invoke XGBoost (or some other inference library), I don't see anything "embarrassingly overkill" there.
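For context, the in-database path being defended here looks roughly like this (a sketch assuming PostgresML's pgml.predict API; the project, table, and column names are invented):

```sql
-- Hypothetical PostgresML call: XGBoost inference runs inside the Postgres
-- backend, so the feature data never leaves the database.
SELECT pgml.predict(
    'my_xgboost_project',              -- project trained earlier, e.g. pgml.train(..., algorithm => 'xgboost')
    ARRAY[age, clicks_7d, spend_30d]   -- feature columns already in the table
) AS score
FROM user_features
WHERE user_id = 42;
```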
In fact, as far as I can tell, postgres is not running as a microservice here. The data still has to be marshalled into some output other services can use.
A suggestion: clean up the blog post's charts and headers to make it much, much more clear that what's being compared isn't python vs postgresml.
https://www.postgresql.org/docs/current/plpython.html
Naturally, Rust or C functions will still be faster.
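To make the PL/Python route concrete, a minimal sketch of what such an inference function could look like, assuming plpython3u and the xgboost package are available on the database server (the function name and model path are made up):

```sql
CREATE EXTENSION IF NOT EXISTS plpython3u;

-- Hypothetical PL/Python wrapper around XGBoost inference.
CREATE OR REPLACE FUNCTION predict_xgb(features float8[])
RETURNS float8 AS $$
    import numpy as np
    import xgboost as xgb
    # GD is PL/Python's per-session global dict; cache the loaded booster
    # so the model file is only read once per backend.
    if 'booster' not in GD:
        booster = xgb.Booster()
        booster.load_model('/var/lib/postgresql/model.json')  # made-up path
        GD['booster'] = booster
    dmat = xgb.DMatrix(np.asarray([features], dtype=float))
    return float(GD['booster'].predict(dmat)[0])
$$ LANGUAGE plpython3u;

-- Usage:
-- SELECT predict_xgb(ARRAY[0.1, 2.0, 5.0]);
```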