[return to "Predicting the Future of Distributed Systems"]
1. willva+yo 2024-08-27 06:31:17
>>borisj+(OP)
Something I anticipate is smarter storage that can do some filtering on pushed-down predicates. There's compute on the storage nodes that is going to waste today.
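
To make the pushdown idea concrete, here is a minimal toy sketch. Everything in it is hypothetical: `StorageNode`, `scan`, `scan_pushdown` and `cells` are made-up names for illustration, not any real system's API. It only compares how much data leaves the storage node when the filter and projection run there instead of in the query engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StorageNode:
    """Toy storage node holding rows locally (hypothetical, for illustration)."""
    rows: list

    def scan(self) -> list:
        # No pushdown: ship every row and every column to the query engine.
        return list(self.rows)

    def scan_pushdown(self, predicate: Callable[[dict], bool], columns: list) -> list:
        # Pushdown: filter and project on the storage node before shipping.
        return [{c: r[c] for c in columns} for r in self.rows if predicate(r)]


def cells(rows: list) -> int:
    # Crude proxy for data transferred: number of cells shipped.
    return sum(len(r) for r in rows)


def is_de(row: dict) -> bool:
    return row["country"] == "DE"


# 1000 rows, 100 of which match the predicate (every tenth row).
node = StorageNode(rows=[
    {"id": i, "country": "DE" if i % 10 == 0 else "US", "payload": "x" * 100}
    for i in range(1000)
])

# Engine-side filtering: everything crosses the network first.
shipped_all = node.scan()
result_engine = [{"id": r["id"]} for r in shipped_all if is_de(r)]

# Storage-side filtering: only matching rows and requested columns are shipped.
shipped_filtered = node.scan_pushdown(is_de, ["id"])

assert result_engine == shipped_filtered
print(cells(shipped_all), cells(shipped_filtered))
```

Both paths return the same result; the difference is only in how many cells cross the wire. A real system would also have to ship the predicate in a serializable form (an expression tree or SQL fragment rather than a Python function), which this sketch skips.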

I was kinda expecting BigQuery to do this under the hood, but it seems like they don't, which is a shame. BigQuery isn't faster than, say, Trino on GCS, even though Google could do some major optimisations here.

2. levent+Eib 2024-08-31 06:32:58
>>willva+yo
The BigQuery Storage Read API claims to support filters and simple projections pushed down to storage: https://cloud.google.com/bigquery/docs/reference/storage. See also this recent paper: https://research.google/pubs/biglake-bigquerys-evolution-tow...

I've also recently proposed a Table Read protocol that should be a "non-vendor-controlled" equivalent of BigQuery Storage APIs: https://engineeringideas.substack.com/p/table-transfer-proto...
