PostgresML is 8-40x faster than Python HTTP microservices

submitted by redbel+(OP) on 2022-10-20 00:45:32 | 116 points 63 comments


>>davewb+u7
It looks like it's a PostgreSQL extension, and probably not one supported by AWS RDS for PostgreSQL or Aurora PostgreSQL. AWS generally only supports extensions that ship with PostgreSQL (and maybe some limited third-party extensions?). The lists of supported extensions are here:

* https://docs.aws.amazon.com/AmazonRDS/latest/PostgreSQLRelea...

* https://docs.aws.amazon.com/AmazonRDS/latest/AuroraPostgreSQ...

>>__mhar+l6
Apache Ballista and Polars do Apache Arrow and SIMD.

The Polars homepage links to the "Database-like ops benchmark" of {Polars, data.table, DataFrames.jl, ClickHouse, cuDF*, spark, (py)datatable, dplyr, pandas, dask, Arrow, DuckDB, Modin}, but PostgresML isn't in it yet? https://h2oai.github.io/db-benchmark/

>>montan+b7
For anyone who skips the intro and just goes to the results, this is what they see: https://imgur.com/tEK73e8

A suggestion: clean up the blog post's charts and headers to make it much, much clearer that what's being compared isn't Python vs. PostgresML.

>>redbel+(OP)
If you are trying to compare performance as an ML service, maybe you should compare it with other ML model serving frameworks like https://github.com/mosecorg/mosec or https://github.com/bentoml/BentoML. Flask/FastAPI are not built for ML services.

>>redbel+(OP)
This benchmark is not very useful. To get any real insights, you'd have to benchmark every single line of the prediction function (called "api") to see where the slowdown is actually coming from https://github.com/postgresml/postgresml/blob/15c8488ade86b0...

Everything else is just speculation.
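
As a rough illustration of what per-stage timing could look like, here is a minimal sketch. The stage names and the placeholder "predict" step are hypothetical stand-ins, not the code from the linked benchmark; the idea is only to accumulate wall-clock time per step so the slow one stands out.

```python
import json
import time
from collections import defaultdict


def time_stages(stages, payload, repeats=1000):
    """Run the stages in order `repeats` times, summing wall-clock time per stage."""
    totals = defaultdict(float)
    for _ in range(repeats):
        value = payload
        for name, fn in stages:
            start = time.perf_counter()
            value = fn(value)  # output of one stage feeds the next
            totals[name] += time.perf_counter() - start
    return dict(totals)


# Hypothetical stages of a prediction endpoint (assumed, not from the benchmark).
stages = [
    ("parse", json.loads),              # request body -> dict
    ("make_key", json.dumps),           # dict -> string key
    ("predict", lambda key: len(key)),  # placeholder for the real model call
]

repeats = 100
timings = time_stages(stages, '{"id": 1, "features": [0.1, 0.2, 0.3]}', repeats)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6 / repeats:.2f} us per call")
```

For a real service, a profiler such as the standard library's `cProfile` would probably give a more complete picture than hand-placed timers.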

>>vasco+v8
You don't need those hops if you use Python either. Python runs inside Postgres.

https://www.postgresql.org/docs/current/plpython.html

Naturally Rust or C functions will still be faster.

>>learnd+Mk
A good start would be to not do silly things like

    body = request.json
    key = json.dumps(body)

in the prediction code to begin with: https://github.com/postgresml/postgresml/blob/15c8488ade86b0...
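
To spell out why that pattern is wasteful: the body is parsed into a dict and then immediately serialized back into a string, which is two passes over the payload to get something close to what the raw request bytes already were. A sketch, assuming the key is only used as a lookup string (an assumption, since the linked code is truncated here), with a plain `bytes` value standing in for the HTTP request body and hypothetical function names:

```python
import json

# Stand-in for the raw HTTP request body (in Flask, roughly request.get_data()).
raw_body = b'{"id": 7, "features": [1.0, 2.0]}'


def key_via_roundtrip(raw: bytes) -> str:
    """The pattern quoted above: parse the JSON, then serialize it again."""
    body = json.loads(raw)
    return json.dumps(body)


def key_via_raw(raw: bytes) -> str:
    """Skip the round trip and use the body text itself as the key."""
    return raw.decode("utf-8")


roundtrip_key = key_via_roundtrip(raw_body)
raw_key = key_via_raw(raw_body)
print(roundtrip_key == raw_key)
```

The two keys only coincide when the client already sends JSON in the same layout that `json.dumps` emits by default; if clients vary whitespace, the round trip does normalize it, so whether it can be dropped depends on how the key is used.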

>>FreakL+vr
PostgresML v1.0 was doing exactly that. When we rewrote in Rust for v2.0, we improved 35x: https://postgresml.org/blog/postgresml-is-moving-to-rust-for...
