Broken Proofs and Broken Provers

submitted by RebelP+(OP) on 2026-02-04 09:00:09 | 63 points 14 comments
[view article] [source] [go to bottom]

NOTE: showing posts with links only show all posts

>>N_Lens+EC
In the described case, this was a simple user error. But you are right nonetheless: To enable the concurrency, the system uses a parallel inference kernel (https://www21.in.tum.de/~wenzelm/papers/parallel-isabelle.pd...). I have heard the anecdote that in the first version, this also allowed one to prove false, by fulfilling a proof promise with a proof that depended recursively on the same promise.

And there are verification tools for verification tools! Isabelle has a verified proof checker that can check whether a given proof is sound: https://www21.in.tum.de/~rosskops/papers/metalogic_pre.pdf The only caveat is that the proof checker itself is verified in Isabelle...

>>RebelP+(OP)
I still remember the time when a gcc bug caused MiniSat to output UNSAT for a satisfiable problem [1]. I was the author of a SAT solver, and I was chasing my tail trying to figure out why I was getting UNSAT for a satsifiable problem. I have to admit I didn't expect it to be a gcc bug... (note: bug was found by Vegard Nossum on the CryptoMiniSat mailing list)

[1] https://www.msoos.org/2013/04/gcc-4-5-2-at-sat-competition-2...

>>zozbot+Pg1
My understanding is that tactics output piece of proof, but that this isn't trusted by the TCB of an LCF prover. This is very far from my area of expertise, but seems to be confirmed by https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-39.html

>>zozbot+Pg1
This is wrong as the others replies also point out. Tactics in LCF-style provers are not part of the TCB. Here is an example of the TCB for an industrial strength prover:

○ https://github.com/jrh13/hol-light/blob/master/fusion.ml

A mere 676 LoCs. This miniscule size of the TCB is where the comparatively lesser bug count (despite intense use) comes from.

>>Dacit+hs1
At least in Isabelle/ML, it seems like in practice there are also untrusted "oracles" that a proof can invoke to generate "thm" objects [0], and it's not entirely trivial to automatically ensure that only trusted sources are used in a project [1], unless I am misunderstanding the linked thread.

Of course there's no free lunch, as you say, in making sure that high-level proofs are lowered into the trusted part correctly, but it's certainly a piece that should ideally be as simple as possible.

[0] https://lists.cam.ac.uk/sympa/arc/cl-isabelle-users/2025-02/...

[1] https://lists.cam.ac.uk/sympa/arc/cl-isabelle-users/2025-02/...

zlacker

Broken Proofs and Broken Provers