As always, one of the most difficult parts is getting good features and data. In this case one difficulty is measuring and defining the reaction time to begin with.
In Counter Strike you rely on footsteps to guess if someone is around the corner and start shooting when they come close. For far away targets, lots of people camp at specifc spots and often shoot without directly sighting someone if they anticipate someone crossing - the hit rate may be low but it's a low cost thing to do. Then you have people not hiding too well and showing a toe. Or someone pinpointing the position of an enemy based on information from another player. So the question is, what is the starting point for you to measure the reaction?
Now let's say you successfully measured the reaction time and applied a threshold of 80ms. Bot runners will adapt and sandbag their reaction time, or introduce motions to make it harder to measure mouse movements, and the value of your model now is less than the electricity needed to run it.
So with your proposal to solve the reaction time problem with KL divergence. Congratulations, you just solved a trivial statistics problem to create very little business value.
You arent eliminating cheaters, that's impossible, you are limiting their impact.
My world is pretty fine, as I don't play games on servers, without active admin/mods that kick and ban people who obviously cheat.
ML solutions can maybe help here, but I believe they can reliable detect cheats, without banning also lucky or skilled players, once I see it.
This is one of the cases where ML methods seem appropriate.