1. Crowdsourced word weighting: your keyboard's stochastic predictions are no longer mostly based on your typing, but rather on what 'everyone' is typing as their next word. This often makes its word replacements anything from suboptimal to downright nonsensical.
2. Aggressive lookbehind correction: these days you have to be seriously on your guard for your keyboard to not sneak-edit something you typed 5 words back, because autocorrect suddenly decided there's a high probability you meant to say something else there (which there clearly isn't, as your eyes and brain exist).
The problem you're encountering is downstream from point 1. Basically, because of the way most people construct a particular sentence, your keyboard thinks you're gonna want to type "bold" next, despite "hold" clearly making more sense. So it'll force "b" on you 4 times in a row until it realizes you really want to type "h".
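To make the "bold" vs "hold" tug-of-war concrete, here is a minimal sketch of blending personal and crowd frequencies. Everything in it is an assumption for illustration: the function name `rank_candidates`, the linear blend, and the counts are made up, and no real keyboard is known to work exactly this way.

```python
def rank_candidates(user_counts, crowd_counts, crowd_weight):
    """Rank next-word candidates by a linear blend of two frequency tables.

    crowd_weight is a hypothetical knob: 0.0 means "only my own typing",
    1.0 means "only what everyone else types".
    """
    user_total = sum(user_counts.values()) or 1
    crowd_total = sum(crowd_counts.values()) or 1
    words = set(user_counts) | set(crowd_counts)
    scores = {
        w: (1 - crowd_weight) * user_counts.get(w, 0) / user_total
        + crowd_weight * crowd_counts.get(w, 0) / crowd_total
        for w in words
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))


# Made-up counts: this user almost always types "hold" in this context,
# while the crowd mostly types "bold".
user_counts = {"hold": 9, "bold": 1}
crowd_counts = {"hold": 200, "bold": 800}

mostly_user = rank_candidates(user_counts, crowd_counts, crowd_weight=0.2)
mostly_crowd = rank_candidates(user_counts, crowd_counts, crowd_weight=0.9)
```

With these numbers, the user-heavy blend ranks "hold" first and the crowd-heavy blend ranks "bold" first, which is the behavior being complained about.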
Going back to the old style of doing keyboards (mostly user-learned dictionaries and probability weighting, and little lookbehind autocorrect) could be done, but within Google and Apple there are probably people who got promoted by switching to the current shitty system. They'll block off any attempt at someone messing with their pride.
(There is a third 'problem' where your visual keys do not correspond to the touchmap at all. SwiftKey has a feature where it can show you what your touchmap and heatmap look like versus the actual layout, and it is often staggeringly different, with many keys vastly tilted. When you try to desperately type "h" after 4 misses, you're doing that with your index finger in "hunt and peck" mode, which does correspond to the visual layout but not to your usual typing on the touchmap layout. There is no way for your keyboard to know you're in "hunt and peck" accuracy mode.)
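A rough sketch of that visual-layout vs touchmap mismatch, assuming taps are resolved to whichever key center is nearest. The coordinates and the names `VISUAL_CENTERS`, `LEARNED_CENTERS`, and `nearest_key` are invented for illustration; this is not how SwiftKey or any other keyboard is known to implement it.

```python
import math

# Where the keys are drawn on screen (hypothetical x, y centers).
VISUAL_CENTERS = {"g": (5.0, 1.0), "h": (6.0, 1.0), "j": (7.0, 1.0)}

# Where this user's fast thumb-typing actually tends to land for each key
# (hypothetical learned touchmap, drifted to the right of the drawn keys).
LEARNED_CENTERS = {"g": (5.6, 1.0), "h": (6.6, 1.0), "j": (7.6, 1.0)}


def nearest_key(tap, centers):
    """Return the key whose center is closest to the tap position."""
    return min(centers, key=lambda key: math.dist(tap, centers[key]))


# A careful "hunt and peck" tap dead on the drawn "h" key.
careful_tap = (6.0, 1.0)

by_visual_layout = nearest_key(careful_tap, VISUAL_CENTERS)
by_learned_touchmap = nearest_key(careful_tap, LEARNED_CENTERS)
```

Here the same careful tap resolves to "h" against the drawn layout but to "g" against the learned touchmap, because the keyboard has no signal that you switched typing styles.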
In the video, the user is typing 'Thumbs up', and when they get to the first 'u' the keyboard shows a 'u' being pressed but a 'j' is inserted instead. Are you suggesting that, due to the way most people construct sentences, the OS thinks that 'thjmbs' is the most likely word? And then the next time the OS thinks that 'thhmbs' is the most likely word?
Both of the issues you've mentioned are common, and irritating, but if you watch the video you can see that that's not what's happening here. Before any autocorrection or adjustment is being done, the keyboard is registering a 'U' and the OS is inputting a J or H or I or some other nearby letter.
The video also debunks the touchmap discontinuity explanation, because you can clearly see which key the keyboard is registering; it's not assuming that you meant to press J or it would highlight the J; it's registering a U, highlighting U, and inputting J.
It sounds to me as though you didn't watch the video and just assumed what issue was being discussed; please do watch it, because this is another, relatively new, issue that lots of people have seen and which is far worse and more frustrating than the other legitimate issues you mentioned.
In addition to the other problems (the keyboard being too prone to catching extremely subtle slides below UI response time), there certainly is the problem that when you crowdsource enough data, you crowdsource everyone's collective mistakes, too. In a lot of that raw data, mistakes are going to be as common as or more common than corrections and/or originally correct spellings.
We do have a great filter for this called a "dictionary", but as the above commenter laments, companies have given up on "just autocorrect to dictionary words" in favor of much more complex "learning" models. Filtering those back to just dictionary words is antithetical to the (sunk cost) expense that went into training these models, and/or the KPIs and promotion incentives that keep prioritizing "AI" and giant crowdsourced data vats over simpler mechanics and local user specifics.
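For what it's worth, the "filter back to dictionary words" step is tiny. A minimal sketch, assuming the model hands back a ranked list of candidate strings; the word lists and the name `filter_suggestions` are hypothetical, and a real keyboard would need a far larger dictionary plus handling for names, slang, and multiple languages.

```python
def filter_suggestions(candidates, dictionary, user_words=frozenset()):
    """Keep only candidates found in the dictionary or the user's own word list.

    The original ranking order is preserved; matching is case-insensitive.
    """
    allowed = set(dictionary) | set(user_words)
    return [word for word in candidates if word.lower() in allowed]


# Hypothetical ranked output of a crowd-trained model, mistakes included.
model_candidates = ["definately", "definitely", "defiantly", "definatly"]

# Hypothetical tiny dictionary and user-learned word list.
dictionary = {"definitely", "defiantly", "hold", "bold"}
user_words = {"thjmbs"}  # something this user deliberately added

filtered = filter_suggestions(model_candidates, dictionary, user_words)
```

The crowd's common misspellings drop out while the model's ordering of the surviving words is kept, and the user's own additions are still allowed through.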