zlacker

[parent] [thread] 3 comments
1. PaulHo+(OP)[view] [source] 2023-04-27 22:44:00
Sometimes I wonder if text generation could be formulated as a planning/optimization problem, and whether that facility could solve embedded planning problems as a byproduct.
replies(2): >>qumpis+g6 >>YeGobl+Nw1
2. qumpis+g6[view] [source] 2023-04-27 23:32:45
>>PaulHo+(OP)
RL in ChatGPT is used for exactly that: to generate text that maximizes a reward. And if you have other domains with their own reward functions, you could plan over them too.
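For concreteness, here is a minimal REINFORCE sketch of that idea, not ChatGPT's actual RLHF pipeline: a toy policy samples token sequences and is nudged toward whatever reward function you plug in, which is exactly the hook that would let you swap in a reward from some other planning domain. The model, vocabulary, and reward are all invented for illustration.

  import torch
  import torch.nn as nn

  VOCAB, SEQ_LEN = 16, 8

  class TinyPolicy(nn.Module):
      def __init__(self):
          super().__init__()
          self.embed = nn.Embedding(VOCAB, 32)
          self.rnn = nn.GRU(32, 64, batch_first=True)
          self.head = nn.Linear(64, VOCAB)

      def forward(self, tok, h=None):
          x = self.embed(tok)
          out, h = self.rnn(x, h)
          return self.head(out), h

  def reward_fn(seq):
      # Stand-in for any domain reward: here, count occurrences of token 3.
      return float((seq == 3).sum())

  policy = TinyPolicy()
  opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

  for step in range(200):
      tok = torch.zeros(1, 1, dtype=torch.long)   # start token
      h, log_probs, seq = None, [], []
      for _ in range(SEQ_LEN):
          logits, h = policy(tok, h)
          dist = torch.distributions.Categorical(logits=logits[:, -1])
          tok = dist.sample().unsqueeze(1)
          log_probs.append(dist.log_prob(tok.squeeze(1)))
          seq.append(tok.item())
      R = reward_fn(torch.tensor(seq))
      loss = -R * torch.stack(log_probs).sum()    # REINFORCE: push up log-prob of rewarded sequences
      opt.zero_grad(); loss.backward(); opt.step()

Swapping reward_fn for, say, a plan-quality score is what "planning on other domains" would amount to in this framing.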
replies(1): >>PaulHo+Pf
3. PaulHo+Pf[view] [source] [discussion] 2023-04-28 01:09:21
>>qumpis+g6
My impression is that the complex optimization happens during training, but that actual inference uses some kind of greedy algorithm like beam search. If the inference algorithm were using simulated annealing or something like that, it would be a different story.
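A rough sketch of that contrast, using a toy scoring function rather than a real language model: beam search extends partial sequences left to right and keeps only the top-k by cumulative score, while a simulated-annealing decoder mutates a complete sequence and sometimes accepts worse candidates depending on a temperature, which is closer to global optimization over the whole output.

  import math, random

  VOCAB, LENGTH = list(range(10)), 6

  def log_prob(seq):
      # Pretend score that prefers ascending sequences (placeholder for a sum of LM log-probs).
      return sum(-abs(b - a - 1) for a, b in zip(seq, seq[1:]))

  def beam_search(k=3):
      beams = [([], 0.0)]
      for _ in range(LENGTH):
          candidates = [(seq + [t], log_prob(seq + [t])) for seq, _ in beams for t in VOCAB]
          beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]  # greedy pruning
      return beams[0]

  def simulated_annealing(steps=5000, t0=2.0):
      seq = [random.choice(VOCAB) for _ in range(LENGTH)]
      score = log_prob(seq)
      for i in range(steps):
          temp = t0 * (1 - i / steps) + 1e-3
          cand = seq[:]
          cand[random.randrange(LENGTH)] = random.choice(VOCAB)   # local mutation of a full sequence
          cand_score = log_prob(cand)
          # Always accept improvements; accept worse moves with temperature-dependent probability.
          if cand_score > score or random.random() < math.exp((cand_score - score) / temp):
              seq, score = cand, cand_score
      return seq, score

  print(beam_search())
  print(simulated_annealing())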
4. YeGobl+Nw1[view] [source] 2023-04-28 13:57:19
>>PaulHo+(OP)
Yes, it seems like casting story generation as a planning problem was a standard approach, at least in the recent past (I'm guessing everyone is turning to LLMs now):

Story planners start with the premise that the story generation process is a goal-driven process and apply some form of symbolic planner to the problem of generating a fabula. The plan is the story.

https://thegradient.pub/an-introduction-to-ai-story-generati...
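To make the "the plan is the story" idea concrete, here is a toy STRIPS-style forward planner over hand-written story actions, where the action sequence that reaches the goal state is the fabula. The predicates and actions are invented for illustration and aren't taken from any of the systems the article surveys.

  from collections import deque

  # Each action: (name, preconditions, add effects, delete effects), all sets of facts.
  ACTIONS = [
      ("hero_travels_to_cave", {"hero_at_home"},                    {"hero_at_cave"},   {"hero_at_home"}),
      ("hero_finds_sword",     {"hero_at_cave"},                    {"hero_has_sword"}, set()),
      ("hero_fights_dragon",   {"hero_at_cave", "hero_has_sword"},  {"dragon_dead"},    set()),
      ("hero_returns_home",    {"dragon_dead"},                     {"hero_at_home"},   {"hero_at_cave"}),
  ]

  def plan(initial, goal):
      """Breadth-first forward search over world states; returns the action sequence."""
      frontier = deque([(frozenset(initial), [])])
      seen = {frozenset(initial)}
      while frontier:
          state, story = frontier.popleft()
          if goal <= state:
              return story
          for name, pre, add, delete in ACTIONS:
              if pre <= state:
                  nxt = frozenset((state - delete) | add)
                  if nxt not in seen:
                      seen.add(nxt)
                      frontier.append((nxt, story + [name]))
      return None

  fabula = plan({"hero_at_home"}, {"dragon_dead", "hero_at_home"})
  print(fabula)
  # ['hero_travels_to_cave', 'hero_finds_sword', 'hero_fights_dragon', 'hero_returns_home']

The real systems described in the article are far more elaborate (character intentions, causal links, drama management), but the basic shape is the same: a goal-driven search whose output plan doubles as the story outline.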

As an aside, it is clear from The Gradient article linked above that story generation was doing just fine until LLMs came along claiming to do it right for the first time ever. The earlier approaches took some careful hand-engineering, but they also seemed to generate coherent stories more reliably (although it looks like they didn't have very rich themes or development, etc.). But then, that's the trade-off between classical approaches and big machine learning: either you roll up your sleeves and apply some elbow grease, or you label giant reams of data and pay the giant price of the compute needed to train on them. In a sense, the claimed advance of deep learning is that domain experts can be replaced by cheaply paid, inexpert labellers plus some very big GPU clusters.
