I'm trying to figure out the same thing for my own work. I found a simple way to train location prediction, so I'm using it for guided window prediction. That works well for attention (predict how far back in the past to look) and for memory (predict an (x, y) location for a 2D window into a memory store, so the model looks at the part that will be helpful). I suspect a lot of people out there have found that one weird trick but haven't released it because they don't know how to capitalize on the idea. Why give OpenAI and others the keys to the future for free?
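
For anyone wondering what "predict a distance in the past to look at" could look like in code, here is a minimal sketch of the general idea only. It is not the training trick mentioned above, which isn't described here. It assumes a Gaussian soft window centered at the predicted distance, and the names `soft_window_weights`, `read_window`, and `width` are hypothetical, made up for this illustration.

```python
import math

def soft_window_weights(seq_len, query_pos, distance, width=1.0):
    """Gaussian weights over positions 0..query_pos, centered `distance` steps back.

    Assumption: `distance` would come from a learned predictor; here it is
    just a number passed in by the caller.
    """
    center = query_pos - distance
    # Unnormalized log-weights: closer to the predicted center scores higher.
    scores = [-((i - center) ** 2) / (2.0 * width ** 2)
              for i in range(query_pos + 1)]
    # Numerically stable softmax over the visible (past and current) positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Positions after query_pos get zero weight, so the window stays causal.
    return weights + [0.0] * (seq_len - query_pos - 1)

def read_window(values, weights):
    """Weighted sum of value vectors under the window weights."""
    dim = len(values[0])
    return [sum(w * v[k] for w, v in zip(weights, values)) for k in range(dim)]
```

Calling `soft_window_weights(8, 6, 3.0, 0.5)` puts most of the weight on position 3, three steps behind position 6. Written in an autodiff framework, the same formula is differentiable with respect to `distance`, which is one common way to make a predicted location trainable. The 2D memory case would use the same weighting over an (x, y) grid.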