zlacker

1. __mhar+(OP)[view] [source] 2022-10-20 01:58:24
spaCy is much faster on the GPU. Many folks don't know that cuDF (a pandas-like DataFrame library for GPUs) parallelizes string operations, which are notoriously slow in pandas... shrug...
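To make the string-operations point concrete, here is a minimal sketch, assuming cuDF's `.str` accessor mirrors the pandas one for the calls used. The helper name `count_keyword` and the sample strings are made up for illustration, and the code falls back to pandas when cuDF is not usable, so it says nothing about actual speedups.

```python
import pandas as pd

try:
    # Optional: needs an NVIDIA GPU and a RAPIDS install.
    import cudf
    xdf = cudf
except Exception:
    # No usable cuDF here, so run the same code on pandas.
    xdf = pd


def count_keyword(texts, keyword):
    """Count the strings that contain `keyword`, ignoring case.

    Hypothetical helper: the same .str chain runs on either a
    pandas or a cuDF Series.
    """
    s = xdf.Series(texts)
    return int(s.str.lower().str.contains(keyword, regex=False).sum())


texts = ["spaCy on GPU", "Pandas strings are slow", "cuDF runs on the GPU"]
print(count_keyword(texts, "gpu"))
```

The only thing that changes between the two backends is which module builds the Series; the `.str` calls themselves stay the same.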
replies(1): >>westur+aa
2. westur+aa[view] [source] 2022-10-20 03:48:06
>>__mhar+(OP)
Apache Ballista and Polars both build on Apache Arrow and use SIMD.

The Polars homepage links to the "Database-like ops benchmark", which covers {Polars, data.table, DataFrames.jl, ClickHouse, cuDF*, spark, (py)datatable, dplyr, pandas, dask, Arrow, DuckDB, Modin} but not yet PostgresML? https://h2oai.github.io/db-benchmark/
