https://blog.cryptographyengineering.com/2014/11/27/zero-kno... was a good intro for interactive ZK proofs but I haven't been able to find something for non-interactive ones.
This blog post comparing ZK-STARKs to erasure coding is in the right flavor but didn't quite stick to my brain either: https://vitalik.eth.limo/general/2017/11/09/starks_part_1.ht...
https://blog.cryptographyengineering.com/2014/11/27/zero-kno...
Usually in an IP, the prover (Bob) has to answer questions from the verifier (Alice), and Alice chooses her questions by flipping a coin. If the Bob doesn’t really know the answer, he’ll get caught cheating with high probability.
So now the trick: Bob starts generates his initial answer. Then he hashes it (“commits” in the jargon), and uses the hash as “Alice’s first coin flip”. Then he answers the question for that flip, hashes the whole thing for “Alice’s second coin flip”… etc.
Bob does this say, 100 times, and then sends the whole simulated conversation to Alice. Alice can verify that he didn’t cheat by checking the intermediate hashes.
The whole thing depends on the ability to not control the result of the hash function, so it’s vital to use a cryptographically secure one.
But the number of questions you need to compensate grows only a little.
For example, interactively if you ask for Merkle tree proof that selected leaf values have a particular property, you only have to ask for about k leaves to get probability 1-(2^-k) that you'd catch a dishonest prover who had committed a Merkle root with less than half the leaves having the property.
Non-interactively, a dishonest prover could secretly grind attempts, say 2^g times, and then you'd have a lower probability of catching them, approximately 1-(2^(g-k)). But g can't be all that large, so you can increase k to compensate without making the proof much larger.
You.can also require certain hashes to have a fixed prefix, like Bitcoin mining, forcing every prover to have to grind 2^p times. This reduces the effective g that a dishonest prover can achieve, allowing k to be smaller for the same security, so allowing the non-interactive proof to be smaller. At the cost of honest provers having to grind.