Recent progress in automated reasoning and supercomputing
gives rise to a new era of brute force.
The game changer is “SAT,” a disruptive, brute-reasoning
technology in industry and science. We illustrate its
strength and potential via the proof of the Boolean
Pythagorean Triples Problem, a long-standing open
problem in Ramsey Theory. This 200TB proof has been
constructed completely automatically—paradoxically,
in an ingenious way. We welcome these bold new proofs
emerging on the horizon, beyond human understanding—
both mathematics and industry need them.
Not only do you need to verify the code, you also need to generate the proof multiple times since there is a significant possibility of hardware error with that large of a dataset.
You should verify the proof multiple times, but there's no reason to generate it multiple times. If there was an error in the witness generation proper verification will catch it, and if there's a problem with the verification regenerating the witness will not help.
I think it would be pretty unlikely to trigger rowhammer accidentally. The incidence of random (memory) errors is very low. Something like once every 1.5 years...
Our first observation is that memory errors are not rare
events. About a third of all machines in the fleet experience
at least one memory error per year (see column CE Incid.
%) and the average number of correctable errors per year
is over 22,000. These numbers vary across platforms, with
some platforms (e.g. Platform A and B) seeing nearly 50% of
their machines affected by correctable errors, while in others
only 12–27% are affected. The median number of errors per year for those
machines that experience at least one error ranges from 25 to 611.
With a proof checker. They're usually tiny for easier verification. Some are trying to verify the verifiers. HOL/Light and Jared Davis's Milawa esp come to mind. Milawa's layered approach is pretty neat.
There is a verified DRAT-trim (the proof checking format used) implementation in Isabelle/HOL [1]. Apparently it is faster than the original:
``We have presented a formally verified tool chain to check DRAT unsatisfiability certificates. In single-threaded mode, our approach is more than two times faster than the (unverified) standard tool drat-trim, on a benchmark suite taken from the 2016 SAT competition.''
[1] Lammich, Peter. "Efficient verified (UN) SAT certificate checking." International Conference on Automated Deduction. Springer, Cham, 2017.
Most SAT solvers work similarly. The authors extended a technique (RUP to DRUP to DRAT) that creates a co-witness for the SAT problem. (This witness is not polynomial in the input size but proportional to the solving time.) The SAT solvers can output this format without slowdown. It is essentially printing some data structures (learned clauses) in the non-hot loop. The rest is clever heuristics, parallelization and work stealing.
The new technique creates smaller and easier witnesses. The solver they used output this witness during solving and saved it. They also created a formally-verified witness verifier. Finally run the verifier for every witness in the sub-problem.
200 TB is not a lot, though, for a single pass validity check today. A core does roughly a billion operations per second and we have tools now for easy provisioning of computation power and distribution of work.
-------------------------
Recent progress in automated reasoning and supercomputing gives rise to a new era of brute force. The game changer is “SAT,” a disruptive, brute-reasoning technology in industry and science. We illustrate its strength and potential via the proof of the Boolean Pythagorean Triples Problem, a long-standing open problem in Ramsey Theory. This 200TB proof has been constructed completely automatically—paradoxically, in an ingenious way. We welcome these bold new proofs emerging on the horizon, beyond human understanding— both mathematics and industry need them.