For example, a Twitch streamer types replies into their stream chat with a live mic. Later, the streamer types their Twitch password. Someone using this technique could reasonably learn the keystroke sounds from the first scenario and apply what they learned to the second.
Even if a few letters come out wrong in something like the above, a human attacker will easily be able to fix it manually.
"horswstaplevatterucorrect", for example, is still intelligible.
You don’t need to guess every character.
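To make that concrete, here is a rough sketch of how a noisy guess could be repaired automatically rather than by eye. Everything in it is an assumption for illustration: the tiny WORDS list stands in for a real dictionary, segment() is a made-up helper, and the error budget of two typos per word is arbitrary. It splits the garbled string into chunks and snaps each chunk to the nearest dictionary word by edit distance.

```python
# Hypothetical stand-in for a real wordlist (assumption, not from the post).
WORDS = ["correct", "horse", "battery", "staple", "password", "monkey"]


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance (insert, delete, substitute cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def segment(noisy: str, max_err: int = 2):
    """Split `noisy` into dictionary words, allowing up to `max_err` typos
    per word. Returns (total_typos, words) for the cheapest split found."""
    inf = float("inf")
    # best[i] holds the cheapest (cost, words) covering noisy[:i]
    best = [(inf, [])] * (len(noisy) + 1)
    best[0] = (0, [])
    for i in range(len(noisy)):
        if best[i][0] == inf:
            continue  # no valid split reaches position i
        for word in WORDS:
            # try chunks one shorter, equal to, or one longer than the word
            for size in range(max(1, len(word) - 1), len(word) + 2):
                j = i + size
                if j > len(noisy):
                    break
                d = edit_distance(noisy[i:j], word)
                if d <= max_err:
                    cost = best[i][0] + d
                    if cost < best[j][0]:
                        best[j] = (cost, best[i][1] + [word])
    return best[len(noisy)]


print(segment("horswstaplevatterucorrect"))
# (3, ['horse', 'staple', 'battery', 'correct'])
```

Three of the 25 characters are wrong here, and the dictionary still pins down all four words. A real attacker would presumably use a much larger wordlist and some ranking of candidates, but the idea is the same.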