zlacker

1. xpe+(OP)[view] [source] 2024-08-25 03:37:12
Re #2: I prefer https://pola.rs over Pandas
replies(1): >>sagarm+ka
2. sagarm+ka[view] [source] 2024-08-25 06:03:24
>>xpe+(OP)
I've heard great things about Polars' performance. To get there, it uses lazy evaluation so the engine can see more of the computation at once, which allows it to apply optimizations similar to those in a SQL engine.
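
To make the lazy-evaluation point concrete, here is a minimal pure-Python sketch. It is not Polars' API or implementation: `LazyTable`, `with_column`, `optimized_plan`, and `collect` are hypothetical names invented for illustration. The idea it shows is that when operations are only recorded, the whole plan is visible before anything runs, so a filter can be moved ahead of an expensive computed column, much as a SQL planner pushes predicates down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical toy: records operations instead of running them."""
    rows: tuple
    plan: tuple = ()  # recorded steps: (kind, column_name, function)

    def with_column(self, name, fn):
        # Record "add column `name` computed as fn(row)"; nothing runs yet.
        return LazyTable(self.rows, self.plan + (("map", name, fn),))

    def filter(self, column, predicate):
        # Record "keep rows where predicate(row[column]) is true".
        return LazyTable(self.rows, self.plan + (("filter", column, predicate),))

    def optimized_plan(self):
        # Move each filter ahead of any preceding map that does not
        # produce the column the filter reads (a simple predicate pushdown).
        steps = []
        for step in self.plan:
            pos = len(steps)
            if step[0] == "filter":
                while (pos > 0 and steps[pos - 1][0] == "map"
                       and steps[pos - 1][1] != step[1]):
                    pos -= 1
            steps.insert(pos, step)
        return steps

    def collect(self, optimize=True):
        # Only now is the recorded plan actually executed.
        steps = self.optimized_plan() if optimize else list(self.plan)
        rows = [dict(r) for r in self.rows]
        for kind, name, fn in steps:
            if kind == "map":
                rows = [{**r, name: fn(r)} for r in rows]
            else:
                rows = [r for r in rows if fn(r[name])]
        return rows


calls = []  # records every invocation of the "expensive" function


def expensive_total(row):
    calls.append(row["price"])
    return row["price"] * row["qty"]


table = LazyTable(tuple(
    {"region": "eu" if i % 4 == 0 else "us", "price": i, "qty": 2}
    for i in range(8)
))
query = (table.with_column("total", expensive_total)
              .filter("region", lambda region: region == "eu"))

result = query.collect()
print(len(result), "rows;", len(calls), "calls to expensive_total")
```

With `optimize=False` the same query computes `total` for every row before filtering; with the reordered plan it only computes it for the rows that survive the filter. An eager API cannot do this on its own, because each call has to finish before the next one is known.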
replies(1): >>xpe+lV5
3. xpe+lV5[view] [source] [discussion] 2024-08-27 12:44:10
>>sagarm+ka
In the early days, even as I appreciated what Pandas could do, I never found its API sane. Pandas has too many special cases and foot-guns. It is a notorious case of poor design.

My opinion is hardly uncommon. If you read over https://www.reddit.com/r/datascience/comments/c3lr9n/am_i_th... you will find many in agreement. Of those who "like" Pandas, it is often only a relative comparison to something worse.

The problems with the Pandas API were neither intrinsic nor unavoidable. They were poor design choices, probably caused by short-term thinking or a lack of experience.

Polars is a tremendous improvement.

replies(1): >>sagarm+018
4. sagarm+018[view] [source] [discussion] 2024-08-28 00:55:41
>>xpe+lV5
Hey, I agree with you.

On eager vs. lazy evaluation: PyTorch defaulting to eager execution seems to have been part of the reason it became popular. Adding optional lazy evaluation later to improve performance seems to have worked for them.
