Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Abstract:

-------------------------

Recent progress in automated reasoning and supercomputing gives rise to a new era of brute force. The game changer is “SAT,” a disruptive, brute-reasoning technology in industry and science. We illustrate its strength and potential via the proof of the Boolean Pythagorean Triples Problem, a long-standing open problem in Ramsey Theory. This 200TB proof has been constructed completely automatically—paradoxically, in an ingenious way. We welcome these bold new proofs emerging on the horizon, beyond human understanding— both mathematics and industry need them.



I tend to get sceptical when I hear the word "disruptive" to describe something.


> This 200TB proof

How do you verify a 200TB proof?


Not only do you need to verify the code, you also need to generate the proof multiple times since there is a significant possibility of hardware error with that large of a dataset.


You should verify the proof multiple times, but there's no reason to generate it multiple times. If there was an error in the witness generation proper verification will catch it, and if there's a problem with the verification regenerating the witness will not help.


..and then rewrite it again via a clean room approach and run that one and see if they tally.

Preferably on different hardware.


The Pi (and other mathematical constant) computation records don't go that far. They verify random results towards the end with a different algorithm.


Excellent point.

Hardware error is almost certain in this case. c.f. rowhammer or bitsquatting.


I think it would be pretty unlikely to trigger rowhammer accidentally. The incidence of random (memory) errors is very low. Something like once every 1.5 years...

This study from Google is pretty interesting:

http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf


I compared to rowhammer and bitsquatting.

From the paper that you cited:

Our first observation is that memory errors are not rare events. About a third of all machines in the fleet experience at least one memory error per year (see column CE Incid. %) and the average number of correctable errors per year is over 22,000. These numbers vary across platforms, with some platforms (e.g. Platform A and B) seeing nearly 50% of their machines affected by correctable errors, while in others only 12–27% are affected. The median number of errors per year for those machines that experience at least one error ranges from 25 to 611.


You have hecksums and other internal consistency checks to deal with those.


They won't actually help. With hecksums, you're darned if you do, and darned if you don't.

;-)


With a proof checker. They're usually tiny for easier verification. Some are trying to verify the verifiers. HOL/Light and Jared Davis's Milawa esp come to mind. Milawa's layered approach is pretty neat.

http://www.cs.utexas.edu/users/jared/milawa/Web/


There is a verified DRAT-trim (the proof checking format used) implementation in Isabelle/HOL [1]. Apparently it is faster than the original:

``We have presented a formally verified tool chain to check DRAT unsatisfiability certificates. In single-threaded mode, our approach is more than two times faster than the (unverified) standard tool drat-trim, on a benchmark suite taken from the 2016 SAT competition.''

[1] Lammich, Peter. "Efficient verified (UN) SAT certificate checking." International Conference on Automated Deduction. Springer, Cham, 2017.



Most SAT solvers work similarly. The authors extended a technique (RUP to DRUP to DRAT) that creates a co-witness for the SAT problem. (This witness is not polynomial in the input size but proportional to the solving time.) The SAT solvers can output this format without slowdown. It is essentially printing some data structures (learned clauses) in the non-hot loop. The rest is clever heuristics, parallelization and work stealing.

The new technique creates smaller and easier witnesses. The solver they used output this witness during solving and saved it. They also created a formally-verified witness verifier. Finally run the verifier for every witness in the sub-problem.


Same way as any proof, right? Check that each step is valid; in this case, that the symbol manipulation obeyed the rules.


What a mathematician[0] thing to say. This sounds like an isomorphism of "it's solvable in principle."

[0] Speaking as a scientist.


200 TB is not a lot, though, for a single pass validity check today. A core does roughly a billion operations per second and we have tools now for easy provisioning of computation power and distribution of work.


Usually by trusting/auditing/verifying the program you use to verify the proof. Which will be much easier to do by hand than 200TB worth of data.


You prove that your SAT solver is correct. Then you prove that the way you expressed the problem is correct.

I'm not in that SAT stuff enough to know if one can prove that :-)


Generate another proof (using a different method; Coq?) of the proof? That's all I can think of.


With software.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: