[1] https://www.width.ai/post/what-is-beam-search
Even a simple 3-gram model can produce better text predictions when it is combined with beam search.
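As a rough illustration of why this helps, here is a minimal sketch of beam search over a count-based 3-gram model. The toy corpus and the names `train_trigram` and `beam_search` are assumptions made for this example and do not come from the linked article. The model is unsmoothed and the prefix must contain at least two tokens.

```python
import math
from collections import Counter, defaultdict


def train_trigram(sentences):
    """Count next-token frequencies for every two-token context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            counts[(a, b)][c] += 1
    return counts


def beam_search(counts, prefix, steps, beam_width):
    """Extend `prefix` by up to `steps` tokens, keeping the `beam_width`
    highest-scoring hypotheses after every step."""
    beams = [(0.0, list(prefix))]  # (cumulative log-probability, tokens)
    for _ in range(steps):
        candidates = []
        for logp, seq in beams:
            followers = counts.get(tuple(seq[-2:]))
            if not followers:
                # Unseen context: carry the hypothesis forward unchanged.
                candidates.append((logp, seq))
                continue
            total = sum(followers.values())
            for token, n in followers.items():
                candidates.append((logp + math.log(n / total), seq + [token]))
        # Prune to the best `beam_width` hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy corpus (hypothetical): "sat" is the most likely word after "the cat",
# but every continuation of "the cat sat" is individually unlikely.
corpus = [
    "the cat sat down",
    "the cat sat up",
    "the cat sat still",
    "the cat ran away",
    "the cat ran away",
]
model = train_trigram(corpus)

for width in (1, 2):
    logp, seq = beam_search(model, ["the", "cat"], steps=2, beam_width=width)[0]
    print(width, " ".join(seq), round(math.exp(logp), 3))
```

With `beam_width=1` the search is greedy: it commits to "sat" (probability 0.6) and ends with a sequence of probability about 0.2. With `beam_width=2` it also keeps "ran" alive and finds "the cat ran away" with probability 0.4. The 3-gram model is the same in both runs; only the search changes.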