It's not obvious whether there's any automated way to reliably detect the difference between "use of HDR" and "abuse of HDR". But you could probably catch the most egregious cases, like "every single pixel in the video has brightness above 80%".
My idea is: for each frame, grayscale the image, then count what percentage of the screen is above the standard white level. If more than 20% of the image is >SDR white level, then tone-map the whole video to the SDR white point.