Not arguing that; I'm just saying I don't know that the KL divergence term is what's causing this or is responsible for it, and I haven't seen any compelling argument that increasing the KL term would fix it.
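(For anyone skimming, a minimal sketch of what "increasing the KL term" usually means in this kind of setup, assuming an objective that penalizes divergence from a frozen reference model; the function and parameter names here, like `kl_penalized_loss` and `kl_coef`, are mine, not from the OP's code:)

```python
import torch
import torch.nn.functional as F

def kl_penalized_loss(logits, ref_logits, task_loss, kl_coef=0.1):
    """Combine a task loss with a KL penalty against a frozen reference model.

    kl_coef is the weight people mean by "increasing the KL term": a larger
    value pulls the trained model harder toward the reference distribution.
    """
    logp = F.log_softmax(logits, dim=-1)
    ref_logp = F.log_softmax(ref_logits, dim=-1)
    # KL(policy || reference), averaged over tokens in the batch
    kl = (logp.exp() * (logp - ref_logp)).sum(dim=-1).mean()
    return task_loss + kl_coef * kl, kl
```

Whether turning that coefficient up would actually address the OP's issue is exactly what's in question.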
There's no question the OP found a legit issue. The questions are more like:
1) What caused it?
2) How do you fix it?
3) What effect would fixing it actually have?