zlacker

[return to "Predicting the Future of Distributed Systems"]
1. willva+yo[view] [source] 2024-08-27 06:31:17
>>borisj+(OP)
Something I anticipate is smarter storage that can do some filtering on push down predicates. There's compute on the storage nodes that is being wasted today.

I was kinda expecting BigQuery to do this under the hood, but it seems like they don't, which is a shame. BigQuery isn't faster than, say, trino on gcs, even though Google could do some major optimisations here.

◧◩
2. okr+Qv[view] [source] 2024-08-27 08:13:29
>>willva+yo
I also wonder if Athena does this with AWS. Parquet supports pushdown. But i would suspect, pushdown predicates would mean that the file storage unit has to have some logic to execute custom code, bringing back the code to the data. The promise of spark, once. It would be a huge win, definitly. Hmmm.

But opens up also a threat vector. And you have competing users running their predicates. So one has to think also about queues and pipelining and so on. But probably also solvable, just like on any multiuser system.

Interesting.

[go to top]