zlacker

[return to "LLM with Planning"]

>>mercat+(OP)
Sometimes I wonder whether text generation could be formulated as a planning/optimization problem, and whether that facility could solve embedded planning problems as a byproduct.

>>PaulHo+81
RL in ChatGPT is used for exactly that: generating text that maximizes a reward. If you have other domains with their own reward functions, you could plan over them as well.
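
To make the "generation as reward maximization" idea concrete, here is a toy sketch. It is not how ChatGPT is trained (that uses RL on model weights); it swaps in plain beam search over token sequences, just to show that once you have a reward function, picking the next token becomes a small planning problem. The names `beam_search`, `toy_reward`, and `VOCAB` are made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = Tuple[str, ...]


def beam_search(
    vocab: Sequence[str],
    reward: Callable[[Tokens], float],
    length: int,
    beam_width: int = 3,
) -> Tokens:
    """Return the best-scoring sequence of `length` tokens found by beam search."""
    beams: List[Tokens] = [()]
    for _ in range(length):
        # Extend every kept partial sequence by every token in the vocabulary.
        candidates = [beam + (tok,) for beam in beams for tok in vocab]
        # Keep only the partial sequences with the highest reward so far.
        candidates.sort(key=reward, reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


def toy_reward(seq: Tokens) -> float:
    """Hypothetical reward: one point per adjacent pair in increasing order."""
    return sum(1.0 for a, b in zip(seq, seq[1:]) if a < b)


VOCAB = ["a", "b", "c", "d"]
best = beam_search(VOCAB, toy_reward, length=4)
print(best, toy_reward(best))
```

Swapping `toy_reward` for a reward from another domain (a route cost, a puzzle score) leaves the search loop unchanged, which is roughly the point being made above. Beam search is greedy and can miss the optimum when the reward only shows up late in the sequence.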
