zlacker

1. montan+(OP) 2022-10-20 04:50:32
Low-effort comment from someone who didn't read the post.

- Multiple formats were compared

- DuckDB is not a production-ready service

- Pandas isn't used

You seem to be trolling.

replies(1): >>learnd+v
2. learnd+v 2022-10-20 04:56:55
>>montan+(OP)
How would I be able to respond to the post in detail if I didn't read it? What a bizarre, defensive response. To address your points:

> - Multiple formats were compared

Yes, but none of them was a zero-copy or otherwise efficient format, such as FlatBuffers. Minimizing copies was mentioned as one of the highlights of PostgresML:

> PostgresML does one in-memory copy of features from Postgres
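To make the zero-copy distinction concrete, here is a minimal sketch using only the Python standard library. It is not how PostgresML, FlatBuffers, or the blog post's benchmark handle data; the `features` array is a hypothetical stand-in for a feature column. It only shows the difference between sharing a buffer and duplicating it.

```python
import array

# Hypothetical feature column (an assumption for illustration only).
features = array.array("d", (float(i) for i in range(1000)))

# Zero-copy: a memoryview slice refers to the same underlying buffer.
view = memoryview(features)[10:20]

# Copy: slicing an array.array allocates a new array with duplicated data.
copy = features[10:20]

# Mutate the original buffer in place.
features[10] = -1.0

# The view sees the change because it shares memory with `features`;
# the copy does not, because it owns its own memory.
view_sees_change = view[0] == -1.0
copy_is_unchanged = copy[0] == 10.0
```

Every extra copy of a large feature set costs both time and memory, which is why the number of copies matters when comparing serialization formats.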

> - DuckDB is not a production-ready service

What issues did you have with DuckDB? You could use some other in-memory store, like Plasma, if you don't like DuckDB.

> - Pandas isn't used

That was responding to this point in the post:

> Since Python often uses Pandas to load and preprocess data, it is notably more memory hungry. Before even passing the data into XGBoost, we were already at 8GB RSS (resident set size); during actual fitting, memory utilization went to almost 12GB.
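For anyone who wants to check RSS figures like these on their own workload, here is a rough sketch of reading the peak resident set size of the current process from Python on a Unix-like system. This is an assumption about one way to measure it, not the method the post used, and the `data` list is a hypothetical stand-in for loaded features.

```python
import resource
import sys


def peak_rss_bytes() -> int:
    """Return the peak resident set size of this process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024


before = peak_rss_bytes()

# Hypothetical stand-in for loading and preprocessing a dataset.
data = [float(i) for i in range(1_000_000)]

after = peak_rss_bytes()
print(f"peak RSS before: {before / 2**20:.1f} MiB, after: {after / 2**20:.1f} MiB")
```

Because this is a peak value, it never decreases; comparing it before and after a step shows only whether that step raised the high-water mark.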

> You seem to be trolling.

By criticizing the blog post?
