zlacker

1. cgearh+(OP)[view] [source] 2025-06-25 01:04:45
The current-gen VLA architectures include some tricks (like compressed action tokenization and diffusion decoding) to reach action frequencies between 50 and 200 Hz. I think they're _more_ efficient this way than regular LLMs trying to do everything through text.
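To make the "compressed action tokenization" part concrete, here's a rough sketch of one way it can work, assuming a frequency-domain scheme: run a DCT over a chunk of actions, quantize the coefficients, and keep only the non-zero ones as tokens. The function names, the `scale` value, and the sine-wave trajectory are all made up for illustration, and this isn't any particular model's tokenizer.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis; row k is the k-th cosine basis vector.
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    m[0] /= np.sqrt(2.0)
    return m

def compress_chunk(actions, scale=10.0):
    # Hypothetical helper: transform one action dimension over a chunk,
    # then round the scaled coefficients to integers ("tokens").
    coeffs = dct_matrix(len(actions)) @ actions
    return np.round(coeffs * scale).astype(int)

def decompress_chunk(tokens, scale=10.0):
    # Inverse of the orthonormal DCT is its transpose.
    return dct_matrix(len(tokens)).T @ (tokens / scale)

# Assumed setup: a 1-second chunk of a smooth trajectory sampled at 50 Hz.
hz = 50
t = np.arange(hz) / hz
actions = 0.5 * np.sin(2 * np.pi * t)

tokens = compress_chunk(actions)
recon = decompress_chunk(tokens)
nonzero = int(np.count_nonzero(tokens))
max_err = float(np.max(np.abs(recon - actions)))
print(f"{len(actions)} actions -> {nonzero} non-zero tokens, max error {max_err:.3f}")
```

Smooth motions put most of their energy in a few low-frequency coefficients, so the model has far fewer tokens to emit per chunk than one token per timestep, which is roughly where the efficiency win would come from.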