zlacker

1. PaulHo+81 2023-04-27 22:44:00
>>mercat+(OP)
Sometimes I wonder whether text generation could be formulated as a planning/optimization problem, and whether that facility could then solve embedded planning problems as a byproduct.
2. qumpis+o7 2023-04-27 23:32:45
>>PaulHo+81
The RL step in ChatGPT is used for that: it trains the model to generate text that maximizes a reward. And if you have other domains with their own reward functions, you could plan over those in the same way.
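
To make the "generate a sequence that maximizes reward" idea concrete, here is a toy sketch. It is not how ChatGPT's RL training works: it does no learning at all and simply runs a beam search over token sequences against a reward function at generation time. The move vocabulary, the grid goal, and the names `MOVES`, `reward`, and `beam_search` are all made up for illustration.

```python
# Toy sketch: treat "text generation" as a search for the token sequence
# that maximizes a reward. Everything here is hypothetical and illustrative.

# Hypothetical vocabulary: each token is a move on a 2D grid.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def reward(tokens, goal=(2, 1)):
    """Reward is the negative Manhattan distance to the goal after
    executing the moves from the origin (0 means the goal was reached)."""
    x, y = 0, 0
    for token in tokens:
        dx, dy = MOVES[token]
        x, y = x + dx, y + dy
    return -(abs(goal[0] - x) + abs(goal[1] - y))


def beam_search(reward_fn, length, beam_width=3):
    """Grow sequences one token at a time, keeping only the
    beam_width highest-reward partial sequences at each step."""
    beams = [[]]
    for _ in range(length):
        candidates = [beam + [token] for beam in beams for token in MOVES]
        candidates.sort(key=reward_fn, reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


plan = beam_search(reward, length=3)
print(plan, reward(plan))
```

Swapping in a different `reward` changes the domain being planned over without touching the search, which is the point of the comment above. Beam search is a heuristic, so on harder reward functions it can miss the best sequence.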