For reference, we're aiming for 1-100 GB/s per server in our Python ETL+ML+viz pipelines.
Interestingly, DuckDB and Polars are nice for smaller, non-ETL/ML workloads, but once it's heavy analytical processing, we use cuDF / dask_cudf for much better perf per watt and per dollar. I'd love the low overhead and typing benefits of Polars, but as soon as you're looking at GB+/s and occasionally bigger-than-memory data, the core software and hardware need to change a bit, end-to-end. A rough sketch of what that looks like for us is below.
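To give a flavor: here's a minimal sketch (file path and column names are made up for illustration) of the kind of dask_cudf job we'd run for a bigger-than-memory groupby, with dask_cuda handling GPU-to-host spilling:

    # rough sketch: multi-GB parquet groupby on GPU with spill-to-host enabled
    # (path and column names are hypothetical, for illustration only)
    import dask_cudf
    from dask_cuda import LocalCUDACluster
    from dask.distributed import Client

    # one worker per GPU; device_memory_limit lets partitions spill to host RAM
    # when the working set is bigger than GPU memory
    cluster = LocalCUDACluster(device_memory_limit="24GB")
    client = Client(cluster)

    # lazily read a partitioned parquet dataset, then filter + aggregate on-GPU
    ddf = dask_cudf.read_parquet("/data/events/*.parquet")
    out = (
        ddf[ddf.status == "ok"]
        .groupby("tenant_id")
        .agg({"latency_ms": "mean", "bytes": "sum"})
        .compute()  # result comes back as a single cudf.DataFrame
    )
    print(out.head())

The equivalent Polars code is arguably cleaner, but at these rates the GPU path is what gets us the perf per watt / $ we care about.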
(and if folks are into graph-based investigations, we're hiring backend/infra :) )