For example, in NLP a huge amount of pre and post processing of data is needed outside of the GPU.
The Polars homepage links to the "Database-like ops benchmark" of {Polars, data.table, DataFrames.jl, ClickHouse, cuDF*, spark, (py)datatable, dplyr, pandas, dask, Arrow, DuckDB, Modin,} but not yet PostgresML? https://h2oai.github.io/db-benchmark/