When talking about light or radio waves, you're probably talking about Classical Electrodynamics, which does not include the notion of a photon. In Classical EM, light is an EM wave in the sense that if you look at the electric field E or magnetic field B (of a plane wave) at a fixed point in space it is varying sinusoidally. So the waviness is in the amplitude of the E and B fields.
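For concreteness, the textbook form of such a plane wave can be written out. This is a sketch under assumptions not stated above: a wave in vacuum travelling along x, linearly polarized along y, with amplitude E_0, wavenumber k and angular frequency \omega = ck.

```latex
\mathbf{E}(x,t) = E_0 \cos(kx - \omega t)\,\hat{\mathbf{y}}, \qquad
\mathbf{B}(x,t) = \frac{E_0}{c} \cos(kx - \omega t)\,\hat{\mathbf{z}}

% At a fixed point x = x_0 the field just oscillates in time:
E_y(x_0,t) = E_0 \cos(\omega t - k x_0)
```

Nothing is physically wiggling sideways through space here; what oscillates is the field strength at each point.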
Once you talk about photons, you're in the realm of Quantum Mechanics (QM), and yes, things get harder to understand.
It's actually all just fields according to the Standard Model (particle physics), a quantum field theory (QFT).
In QFT, each fundamental particle has its own field that permeates the whole universe, e.g. an electron field, a photon field, etc. Disturbances in these fields are what one would call particles in non-relativistic QM.
So Classical -> QM (quantum system, classical observer/apparatus) -> QFT (quantum everything)
In classical EM, light is a wave. In QM, light is particles. In QFT particles are just disturbances in the all-pervading fields.
Binney has said that QM is just measurement for grownups, or some such. What is a measurement? It's when the system you're observing becomes entangled with the measuring device. We don't know the exact state of every atom in our measuring device, yet any of them could perturb the system we're measuring. So QM is a hack where you treat the system as quantum but the observer/measuring device as classical, which is why you need this confusing wave-function collapse. It was a conscious choice in the development of the theory. This last bit might give some insight into why trying to sense the photon at one of the double slits ruins the interference pattern.
Because it's wrong. It's a quantum of the electromagnetic field. It's neither a wave nor a particle. It just happens to have some properties of both.
But for the duality, there's something bigger that the responses always seem to blow past. Is the wave-like nature there to explain behavior (the wavy double-slit intensity pattern), or is it just a mathematical device that maps onto measured probabilities?
Quantum stories always seem so backwards. The root phenomenon is some sort of irreducible probability. But then the mechanical part (interference in the double-slit) goes in a totally different direction. Instead of just turning the situation into a probability of one-or-the-other slit, it STAYS as a wave.
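To put numbers on "stays as a wave" versus "probability of one-or-the-other slit", here is a rough Python sketch. It is a toy model: the unit-magnitude amplitudes, the 500 nm wavelength and the function names are all assumptions made for illustration, not taken from any real experiment.

```python
import cmath
import math

def slit_amplitude(path_length, wavelength):
    """Unit-magnitude complex amplitude for a path of the given length."""
    return cmath.exp(2j * math.pi * path_length / wavelength)

def screen_intensities(path_difference, wavelength):
    """Return (wave_sum, probability_sum) at one point on the screen.

    wave_sum: add the two amplitudes, then square (interference).
    probability_sum: square each amplitude, then add (one-or-the-other slit).
    """
    a1 = slit_amplitude(0.0, wavelength)
    a2 = slit_amplitude(path_difference, wavelength)
    wave_sum = abs(a1 + a2) ** 2
    probability_sum = abs(a1) ** 2 + abs(a2) ** 2
    return wave_sum, probability_sum

wavelength = 500e-9  # assumed value, in metres
for fraction in (0.0, 0.25, 0.5):
    wave, either = screen_intensities(fraction * wavelength, wavelength)
    print(f"path difference {fraction:.2f} wavelengths: "
          f"wave={wave:.3f}  either-slit={either:.3f}")
```

The either-slit column never changes with position, while the wave column swings between bright and dark. The measured pattern follows the wave column, which is why the amplitudes have to be added before squaring.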
Okay, now you have a new hole in the story. If the photon refuses to choose just 1 slit to go through, why does it choose 1 spot on the photo paper to land on?
Why do we not still have to consider interference in outcomes after the photon makes its mark on the paper? Why does there appear to be a limit on entanglement, such that it goes away beyond a certain scale? Why are quantum computers hard?
The photon (as a field excitation) goes through both slits, but is quantized so only has enough energy to trigger a mark at 1 spot on the photo paper.
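A small simulation can illustrate this. The sketch below is a toy under stated assumptions: an idealized cos^2 fringe pattern with no single-slit envelope, a screen split into bins, and hypothetical names (`hit_probabilities`, `fire_photons`). Each photon lands in exactly one bin, chosen with probability proportional to the wave intensity there.

```python
import math
import random

def hit_probabilities(num_bins, fringes=3):
    """Normalized cos^2 fringe pattern across the screen (idealized)."""
    weights = []
    for i in range(num_bins):
        x = (i + 0.5) / num_bins - 0.5  # bin centre, screen spans [-0.5, 0.5]
        weights.append(math.cos(math.pi * fringes * x) ** 2)
    total = sum(weights)
    return [w / total for w in weights]

def fire_photons(num_photons, num_bins=60, seed=0):
    """Return per-bin hit counts after sending photons one at a time."""
    rng = random.Random(seed)
    probs = hit_probabilities(num_bins)
    counts = [0] * num_bins
    # Each photon deposits its whole quantum of energy in a single bin.
    for bin_index in rng.choices(range(num_bins), weights=probs, k=num_photons):
        counts[bin_index] += 1
    return counts

for n in (10, 100, 10000):
    counts = fire_photons(n)
    peak = max(counts)
    # Crude text plot: '#' where a bin has at least half the peak count.
    print(f"{n:>6} photons: " + "".join("#" if c * 2 >= peak else "." for c in counts))
```

With a handful of photons the marks look random; the fringes only show up once many single hits have accumulated.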
> Why do we not still have to consider interference in outcomes after the photon makes its mark on the paper?
If we want to be completely accurate, we should. However, so many interactions happen so quickly that the law of large numbers takes over and obscures the quantum reality. The technical term for this is decoherence.
> Why are quantum computers hard?
Exactly because of this decoherence. It is very difficult to keep the qubit state isolated from the environment throughout the computation.
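As a rough illustration of what losing the qubit state to the environment looks like, here is a toy Python model. It assumes the simplest case, pure dephasing with exponential decay, and a made-up coherence time `T2`; real devices have more noise channels than this.

```python
import math

def dephased_density_matrix(t, t2):
    """Density matrix of (|0> + |1>)/sqrt(2) after pure dephasing for time t.

    The diagonal entries (populations) stay at 0.5; the off-diagonal
    entries (coherences) decay as exp(-t / t2).
    """
    coherence = 0.5 * math.exp(-t / t2)
    return [[0.5, coherence],
            [coherence, 0.5]]

def purity(rho):
    """Tr(rho^2): 1.0 for a pure state, 0.5 for a fully mixed qubit."""
    return sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2))

T2 = 100e-6  # hypothetical coherence time, in seconds
for t in (0.0, T2, 5 * T2):
    rho = dephased_density_matrix(t, T2)
    print(f"t = {t:.1e} s  coherence = {rho[0][1]:.4f}  purity = {purity(rho):.4f}")
```

Once the off-diagonal terms are gone, the qubit behaves like a classical coin flip and there is no interference left to compute with, so the whole computation has to finish before that happens.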