Abstract: ------------------------- Recent progress in automated reasoning and s...

maxxxxx · on Aug 1, 2017

I tend to get sceptical when I hear the word "disruptive" to describe something.

Xeoncross · on Aug 1, 2017

> This 200TB proof

How do you verify a 200TB proof?

slaymaker1907 · on Aug 1, 2017

Not only do you need to verify the code, you also need to generate the proof multiple times since there is a significant possibility of hardware error with that large of a dataset.

johncolanduoni · on Aug 1, 2017

You should verify the proof multiple times, but there's no reason to generate it multiple times. If there was an error in the witness generation proper verification will catch it, and if there's a problem with the verification regenerating the witness will not help.

noir_lord · on Aug 1, 2017

..and then rewrite it again via a clean room approach and run that one and see if they tally.

Preferably on different hardware.

dsacco · on Aug 1, 2017

The Pi (and other mathematical constant) computation records don't go that far. They verify random results towards the end with a different algorithm.

IncRnd · on Aug 2, 2017

Excellent point.

Hardware error is almost certain in this case. c.f. rowhammer or bitsquatting.

new299 · on Aug 2, 2017

I think it would be pretty unlikely to trigger rowhammer accidentally. The incidence of random (memory) errors is very low. Something like once every 1.5 years...

This study from Google is pretty interesting:

http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf

IncRnd · on Aug 2, 2017

I compared to rowhammer and bitsquatting.

From the paper that you cited:

Our first observation is that memory errors are not rare events. About a third of all machines in the fleet experience at least one memory error per year (see column CE Incid. %) and the average number of correctable errors per year is over 22,000. These numbers vary across platforms, with some platforms (e.g. Platform A and B) seeing nearly 50% of their machines affected by correctable errors, while in others only 12–27% are affected. The median number of errors per year for those machines that experience at least one error ranges from 25 to 611.

Confusion · on Aug 1, 2017

You have hecksums and other internal consistency checks to deal with those.

AnimalMuppet · on Aug 1, 2017

They won't actually help. With hecksums, you're darned if you do, and darned if you don't.

;-)

nickpsecurity · on Aug 1, 2017

With a proof checker. They're usually tiny for easier verification. Some are trying to verify the verifiers. HOL/Light and Jared Davis's Milawa esp come to mind. Milawa's layered approach is pretty neat.

http://www.cs.utexas.edu/users/jared/milawa/Web/

rrobukef · on Aug 2, 2017

There is a verified DRAT-trim (the proof checking format used) implementation in Isabelle/HOL [1]. Apparently it is faster than the original:

``We have presented a formally verified tool chain to check DRAT unsatisfiability certificates. In single-threaded mode, our approach is more than two times faster than the (unverified) standard tool drat-trim, on a benchmark suite taken from the 2016 SAT competition.''

[1] Lammich, Peter. "Efficient verified (UN) SAT certificate checking." International Conference on Automated Deduction. Springer, Cham, 2017.

nickpsecurity · on Aug 2, 2017

Neat. Here's two more on that topic that go further:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.459....

http://matryoshka.gforge.inria.fr/pubs/sat_sister.pdf

rrobukef · on Aug 2, 2017

Most SAT solvers work similarly. The authors extended a technique (RUP to DRUP to DRAT) that creates a co-witness for the SAT problem. (This witness is not polynomial in the input size but proportional to the solving time.) The SAT solvers can output this format without slowdown. It is essentially printing some data structures (learned clauses) in the non-hot loop. The rest is clever heuristics, parallelization and work stealing.

The new technique creates smaller and easier witnesses. The solver they used output this witness during solving and saved it. They also created a formally-verified witness verifier. Finally run the verifier for every witness in the sub-problem.

SilasX · on Aug 1, 2017

Same way as any proof, right? Check that each step is valid; in this case, that the symbol manipulation obeyed the rules.

noobermin · on Aug 2, 2017

What a mathematician[0] thing to say. This sounds like an isomorphism of "it's solvable in principle."

[0] Speaking as a scientist.

SilasX · on Aug 2, 2017

200 TB is not a lot, though, for a single pass validity check today. A core does roughly a billion operations per second and we have tools now for easy provisioning of computation power and distribution of work.

johncolanduoni · on Aug 1, 2017

Usually by trusting/auditing/verifying the program you use to verify the proof. Which will be much easier to do by hand than 200TB worth of data.

wiz21c · on Aug 2, 2017

You prove that your SAT solver is correct. Then you prove that the way you expressed the problem is correct.

I'm not in that SAT stuff enough to know if one can prove that :-)

falcolas · on Aug 1, 2017

Generate another proof (using a different method; Coq?) of the proof? That's all I can think of.

justinpombrio · on Aug 1, 2017

With software.