Not specific to this article, but it's tragic that computer science curricula, and discussions of these algorithms, virtually never highlight the tight connection between binary search and long division. Long division done in binary is exactly binary search for the correct quotient, with the number written above the line (plus some implicit zeros) being the best lower bound proven thus far. Similarly, division done in decimal is just "ten-ary" search.
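To make the correspondence concrete, here is a minimal Python sketch (the function name, the fixed bit width, and the bound on n are illustrative, not from the article): each iteration asks whether quotient bit i can be claimed, which is exactly the trial-subtraction step for that digit when doing long division in base 2.

    def divide_by_binary_search(n: int, d: int, width: int = 32) -> int:
        """Compute n // d for d > 0 by binary-searching for the quotient,
        one bit at a time, from the most significant bit down."""
        assert d > 0 and 0 <= n < (1 << width)
        q = 0  # best lower bound on the quotient proven so far
        for i in range(width - 1, -1, -1):
            candidate = q | (1 << i)
            # "Does the divisor still fit under n if we also claim this bit?"
            # This is the trial subtraction for digit i in long division.
            if candidate * d <= n:
                q = candidate
        return q

For example, divide_by_binary_search(100, 7) returns 14, matching 100 // 7; doing the same thing with ten candidate digits per position is ordinary decimal long division, i.e. the "ten-ary" search mentioned above.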
Neat. Division being iterative makes me feel better about last week, when I couldn't think up a good closed-form solution for something and didn't feel like computing gradients, so I just made a coordinate descent loop. It'll be funny if it's still in there when it ships.
I realized this in school but it was beaten out of me because making a wrong guess in the process was unacceptable.
Very cool observation, I never thought of that. Thanks!
I studied “computation” at Oxford in 92-95, while C.A.R. Hoare was still there, and basically all they taught us about computers was these formal methods. They actually thought it would revolutionise programming and looked down on anything else as “hand-waving”. I think the syllabus was abandoned only a few years later, thank goodness.
Interesting to hear your experience. I was there 94–97, when the curriculum was still pretty heavy on formal methods (and functional programming).
For me it was wonderful. I already knew how to write computer programs, as that's what I had spent most of my waking hours doing before that point, but learning how to think rigorously about the correctness of programs was a revelation to me.
I studied "Mathematics and Computation" there in 89-92 because they seemed to think that "computation" was a fad that was probably going away, so you couldn't let people study just "Computation".
There was a certain amount of formal methods, but it was taught in a perfunctory manner, as if to satisfy an inconvenient requirement. There was some functional programming too, but it was extremely shallow. Overall, I did not learn a single useful thing in 3 years.
I followed this up with a Ph.D. in Computer Science somewhere else, which was also a complete waste of time.
Not a single useful thing? A complete waste of time? Really not anything that you found intriguing or thought-provoking for its own sake? I did the same course nine years later and found it very stimulating, even if it wasn't always directly "applicable" so to speak. But perhaps the syllabus had changed a bit by then.
What I learned from these comments sharing their experiences from one time to another is that the curriculum evolved and no one had the same experience.
Not to mention how individuals perceive things - two students with similar aptitude in the same class can still have their own very different experiences.
Just a heads up.
> I followed this up with a Ph.D. in Computer Science somewhere else, which was also a complete waste of time.
It has been shown this form of thinking results in depression.
Thanks, friend. I currently don't have depression, but I'll see if I can work towards that.
Lol, thanks for that, was worried I was being too negative!
Still a strong presence in 98-01 but also lots of functional programming, algorithms and complexity and the tail end of the parallel research boom. I loved it.
A couple of decades ago, I attended a lecture by Tony Hoare about program correctness. He used the producer-consumer problem as an example and by starting with pre/post conditions and refining them, he arrived at a (proved) correct program. I went back from the lecture thinking this is so easy, I can apply this as well. Sadly, I realized one has to be Tony Hoare to produce code out of logic with his level of ease.
Sometimes I read a paper or see some code that makes me think that wizards are real and that they walk among us.
Apparently some aspects of Idris, Dafny, and (maybe) Isabelle are helping less-awesome logicians derive programs from their specifications nowadays.
(I haven't tried any of these languages so I don't have a personal story of how helpful they were for me.)
We actually proved the correctness of the fast Long division in Scala.js using similar reasoning. [1] We had mechanical proofs for the other operations, but the solver couldn't handle division; the interplay between bit vector semantics and floating point semantics was too much for it to bear.
Scala.js has the fastest implementation of 64-bit integers on JavaScript engines. The implementations cannot be understood unless you actually prove them correct. IMO that's one of the best reasons to invest in the formal proof of a program fragment.
[1] https://lampwww.epfl.ch/~doeraene/thesis/doeraene-thesis-201... section 4.4.2, page 114.
> The implementations cannot be understood unless you actually prove them correct
I quickly scanned the book you provided, but I couldn't find an explanation.
What do you mean by 'understood'?
> Interestingly, it seems division by zero is impossible, because there is no suitable remainder.
Exercise: however, this algorithm still terminates and computes something when provided with d == 0. What does it compute? Strengthen the postcondition with this information and adjust the proof accordingly.
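The article's exact algorithm isn't reproduced in this thread, but for anyone who wants to experiment with the exercise, here is one common bounded shift-subtract (restoring) division loop, sketched in Python (the names and the fixed width are illustrative, and it may differ in detail from the article's code). Because the loop runs a fixed number of times, it terminates even for d == 0.

    def restoring_divmod(n: int, d: int, width: int = 32) -> tuple[int, int]:
        """Unsigned shift-subtract long division over a fixed bit width."""
        assert 0 <= n < (1 << width) and 0 <= d < (1 << width)
        q, r = 0, 0
        for i in range(width - 1, -1, -1):
            r = (r << 1) | ((n >> i) & 1)  # bring down the next dividend bit
            if r >= d:                     # trial subtraction succeeds
                r -= d
                q |= 1 << i
        # On exit, n == q * d + r always holds; the bound r < d needs d > 0.
        return q, r

For instance, restoring_divmod(42, 5) gives (8, 2), while restoring_divmod(42, 0) gives (2**32 - 1, 42), which still satisfies n == q * d + r.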
Related: https://toccata.gitlabpages.inria.fr/toccata/gallery/divisio...
I think this shows why formal verification, while certainly useful in specialized situations, will likely not be a major boost for software productivity. [1]
[1] - https://worrydream.com/refs/Brooks_1986_-_No_Silver_Bullet.p...:
> I do not believe we will find the magic here. Program verification is a very powerful concept, and it will be very important for such things as secure operating system kernels. The technology does not promise, however, to save labor. Verifications are so much work that only a few substantial programs have ever been verified
My high school CS teacher in the 1990s had been influenced by this and told me that formal proofs of correctness (like with loop invariants, as here) were extremely tedious and that few people were willing to actually go through with them outside of a formal methods class.
This was a perennial topic of Dijkstra's as he complained about people writing programs without knowing why they worked, and, often, writing programs that would unexpectedly fail on some inputs.
But we're getting better tools nowadays. The type systems of recent languages already capture a lot of important behaviors, like nullability. If you can't convince the type checker that you've handled every input case, you probably do have a bug. That's already insight coming from formal methods and PL theory, even when you aren't thinking of yourself as writing down an explicit proof of this property.
I guess the compromise is that the more routine machine-assisted proofs that we're getting from mainstream language type checkers are ordinarily proofs of weaker specifications than Dijkstra would have preferred, so they rule out correspondingly narrower classes of bugs. But it's still an improvement in program correctness.
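As a small, hedged illustration of the exhaustiveness point above (Python checked with an external tool like mypy or pyright; the Shape types are made up for the example): if someone later adds a variant to the union and forgets to handle it, the type checker rejects the assert_never line, which is a machine-checked, if modest, proof of case coverage.

    from dataclasses import dataclass
    from typing import assert_never  # Python 3.11+

    @dataclass
    class Circle:
        radius: float

    @dataclass
    class Rect:
        width: float
        height: float

    Shape = Circle | Rect

    def area(shape: Shape) -> float:
        match shape:
            case Circle(radius=r):
                return 3.14159 * r * r
            case Rect(width=w, height=h):
                return w * h
            case _:
                # Reachable only if Shape gains a variant the match doesn't
                # handle; a type checker then reports an error on this call.
                assert_never(shape)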
My algorithms professor says Dijkstra was wrong about most things, but I think many of Dijkstra's opinions were justified or outright correct. It certainly wasn't the right time for many of his remarks, though.
It's widely agreed that formal verification does not boost software productivity, in the sense that formal verification doesn't speed up development of "software that compiles and runs and we don't care if it's correct".
The point of formal verification is to ensure that the software meets certain requirements with near certainty (subject to gamma radiation, tool failure, etc.). If mistakes aren't important, formal verification is a terrible tool. If mistakes matter, then formal verification may be what you need.
What this and other articles show is that doing formal verification by hand is completely impractical. For formal verification to be practical at all, it must be supported by tools that can automate a great deal of it.
The need for automation isn't new in computing. Practically no one writes machine code directly; we write in higher-level languages (or, more rarely, assembly) and use automated tools to generate the final code. It's been harder to create practical tools for formal verification, but clearly automation is a minimum requirement.
Automation of Hoare logic is quite good these days. Dafny, from MS Research (https://dafny.org), is probably the friendliest formal language of any kind. It's built around Hoare logic and its extension, separation logic. The barrier to entry is low: a seasoned imperative or functional programmer can get going with Dafny in just a few days. Dafny has been used to verify large systems, including many components of AWS.
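For a taste of the style without installing anything, here is the textbook subtraction-based division written as a Python sketch with its precondition, loop invariant, and postcondition spelled out as runtime asserts. This only checks one run; the point of a tool like Dafny is that it discharges the same obligations statically, for all inputs.

    def divide(n: int, d: int) -> tuple[int, int]:
        assert n >= 0 and d > 0                  # precondition
        q, r = 0, n
        while r >= d:
            assert n == q * d + r and r >= 0     # loop invariant
            q, r = q + 1, r - d
        assert n == q * d + r and 0 <= r < d     # postcondition
        return q, r

The invariant together with the negated loop guard (r < d) is exactly what yields the postcondition, which is the general shape of the argument in the article, even if its algorithm differs in detail.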
I am hoping that LLMs make more advanced languages, such as Liquid Haskell or Agda, less tedious to use. Ideally, lots of code should be autocompleted once a human provides a type signature. The advantage of formal verification is that we can be sure the generated code is correct.
How do you encode the difference between a method that adds and a method that multiplies two numbers in the type signature?
It's hard to tell exactly what you mean by "type signature", but I think you mean something like a type signature in Java, Go, or C.
That isn't what people are talking about when they talk about formal verification with a type system. They are talking about much more complex type systems that have equivalent power and expressivity to formal logic.
This intro is a good introduction to the topic. Unfortunately, the correspondence between advanced type systems and logic is itself fairly abstract and uses lots of concepts in non-intuitive ways. This means that you can easily wind up in a situation where you can do lots of proofs about number theory using a tool like Lean 4, but you don't really see the corresponding types (this is what happened to me).
https://leanprover-community.github.io/learn.html
With dependent types, if you really want to. (Or even just with conventional generics or typeclasses: the difference between a ring and a monoid will show up, and that's enough for many cases.)
If you ignore syntax and pretend that the following is a snippet of Java code, you can declare that a variable x always holds an int, like so:
var x: int = y + 5
Here x is the variable being defined, it is declared to hold values of type int, and its initial value is given by the term y + 5.
In many mainstream languages, types and terms live in distinct universes. One starts by asking whether types and terms are all that different. The first step in this direction of inquiry is what are called refinement types. With our imaginary syntax, you can write something like:
val x: { int | _ >= 0 } = y + 5
Once again, x is the variable being defined; it is declared to hold a value of type int at all relevant instants in all executions, and its initial value is given by the term y + 5. But we additionally promise that x will always hold a non-negative value, _ >= 0. For this to typecheck, the typechecker must somehow also confirm that y + 5 >= 0.
But anyway, we have added terms to the previously boring world of types. This allows you to do many things, like so:
val x: int = ...
val y: int = ...
val z: { int | _ >= x && _ >= y } = if x >= y then x else y
We not only declare that z is an integer, but also that it always holds a value at least as large as both x and y.
You asked for the type of a function that multiplies two numbers. The type would look weird, so let me show you an imaginary example of the type of a function that computes the maximum:
val f : (x : int) -> (y : int) -> { int | _ >= x && _ >= y } = ...
This doesn't really get you to the maximum, because f might be computing max(x, y) + 5, but it does show the idea.
The final step in this direction is what are called full-blown dependent types, where the line between types and terms is completely erased.
> This doesn't really get you to the maximum, because f might be computing max(x, y) + 5, but it does show the idea.
Perhaps { int | _ >= x && _ >= y && (_ == x || _ == y) } ?
I just proved in Coq that if all of these always hold for a function, the function coincides exactly with the max function.
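For intuition, here is a finite sanity check in Python (emphatically not a substitute for the Coq proof): over a small range, any z satisfying all three conditions is forced to equal max(x, y).

    # Check the claim exhaustively on a small range of integers.
    for x in range(-5, 6):
        for y in range(-5, 6):
            for z in range(-10, 11):
                if z >= x and z >= y and (z == x or z == y):
                    assert z == max(x, y), (x, y, z)
    print("the refinement pins down max(x, y) on the sampled range")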
Basically weird type systems that encode pre/post conditions.
These ideas have not caught on for a reason. You end up writing the logic twice, and there's a pretty high chance you mess one of them up.
True, but the work can also be reduced significantly with better tooling, which is still being developed but has improved markedly over the past decade: e.g., SMT solvers that can output proofs, or tactics in Coq or Lean.
I'm hoping that this will be a big application of AI, actually. If an AI can be built to do this simple but very tedious work, and your verification tool is capable of catching any errors it makes, then you've covered a major flaw of formal verification (its tediousness) and of AI (its tendency to output bullshit).
Proving safety is just a small part of the problem to be solved. The hard part is actually structuring the program such that its correctness can even be formulated as a formal property which can be proved. For a lot of software that alone is highly nontrivial.
From "No Silver Bullet":
> More seriously, even perfect program verification can only establish that a program meets its specification. The hardest part of the software task is arriving at a complete and consistent specification, and much of the essence of building a program is in fact the debugging of the specification
On the other hand, proofs sometimes give you more than you'd expect. A proof that the implementation of a function always returns some value automatically proves that there's no arbitrary code execution, for instance.
Why should it save labor? What it does is guarantee correctness, which is like the holy grail for programmers. If you know shit's correct, you can just assume stuff and it will actually work on the first try. You don't even need to write tests for it.
> What it does is guarantee correctness, which is like the holy grail for programmers.
Formal verification can never guarantee correctness. For that to happen, you'd need to know that the property you verified is the property you want. If you have the ability to write correct specifications, you also have the ability to write correct programs, and running the verification doesn't add any value.
It guarantees correctness relative to a given requirement.
The real value of formal proofs lies in forcing you to think deeply about the requirement and your implementation of it and to make your assumptions about it explicit.
I have only ever used proof once in my career. We had a problem with an aircraft main computer: it was occasionally failing during start-up and then refusing to start again on all subsequent restarts. It was a multiprocessor computer, and each processor was running start-up tests, some of which would interfere if run at the same time. I was worried that if the start-up was stopped at an arbitrary time, it might leave the control variables in a state that would prevent further start-ups. I used Leslie Lamport's first technique (https://lamport.azurewebsites.net/pubs/pubs.html#proving) in an attempt to prove that it would always start up no matter at what point it had been stopped on the previous run. But I was unable to complete the proof in one situation. So I added a time delay to the start-up of some of the processors to ensure that this situation never occurred. But that didn't fix the original problem. That turned out to be a hardware register being read before it had settled down. I fixed that later.
> It guarantees correctness relative to a given requirement.
OK, but given that you don't know what that requirement is, how does that help you?
But you do know the specification, or at least parts of it in many cases.
In the example upthread, one specification is that, some reasonable time after you turn on the airplane, the flight controller is ready to fly. That's a very reasonable spec, and important to verify in the given story. It's the thin end of the wedge for a lot of real-time guarantees that you might like to make.
As another instance where the specification is known: what is the specification of compiled object code? It's the source code, and proving that the object code matches the source code is a way to protect against adversarial toolchains. Check out seL4 for an example of this kind of formal verification.
Memory safety is another property amenable to formal verification. The Rust community is making use of this very nicely.
The net lesson is that even if it is impossible to get formal specifications for lots of things, there are lots of other things where you can define good specifications (and you might be able to prove you meet them, too!).
Correctness is taken to mean "the program matches its specification". Or, more literally, "the program is a correct implementation of its specification".
Taking the reductio ad absurdum, the program itself perfectly specifies its own behavior. Yet it's hardly something I'd point to as a holy grail. We want programs that do what we want them to, which is generally a human construct rather than a technical one. (And you'd better hope that two different people don't expect the program to do two different things in the same situation!)
There are other ways to exploit formal verification. For example, you may be able to formally prove that the unoptimized version of your program has the same behavior as the optimized version, subject to specified constraints. This can assure that there aren't any bugs in the optimization that affect program correctness. You're eliminating a class of bugs. Of course, it's possible that the unoptimized version is incorrect, but at least you can assure that your aggressive optimization approach didn't cause breakage. Such approaches are commonly used for hardware verification.
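As a miniature, hedged example of that idea using the z3-solver Python bindings (the absolute-value trick below is a stand-in, not something from the thread): the solver symbolically searches all 2^32 inputs for a disagreement between the straightforward version and the branch-free "optimized" one, and finding none proves them equivalent under 32-bit wraparound semantics.

    from z3 import BitVec, If, Not, Solver, unsat

    x = BitVec("x", 32)

    # "Unoptimized" reference: branchy absolute value (< is signed here).
    abs_ref = If(x < 0, -x, x)

    # "Optimized" version: the classic branch-free trick.
    mask = x >> 31                 # arithmetic shift: all ones iff x < 0
    abs_opt = (x + mask) ^ mask

    solver = Solver()
    solver.add(Not(abs_ref == abs_opt))  # ask for any input where they differ
    assert solver.check() == unsat       # no counterexample: equivalent
    print("reference and optimized abs agree on every 32-bit input")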
Huh, this is kind of like other trivial objects in mathematics (typically the empty object and the universal object). One trivial property is "this program computes what this program computes" while the other trivial property is probably "this program computes something".
Although these are themselves only trivial in a programming language in which every string represents a valid program.
The program includes a lot of incidental complexity which would not be part of the spec. For example, for a sort function, the spec would be the "array sorted" postcondition. The code, OTOH, could be arbitrarily complex.
To be clear, the proper postcondition for a sort function is "the output array is a permutation of the input array and it is sorted".
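Written out as an executable predicate (a Python sketch; the helper name is made up), the spec has two conjuncts, and dropping the permutation half would let a "sort" that simply returns an empty list satisfy it.

    from collections import Counter

    def meets_sort_spec(inp: list[int], out: list[int]) -> bool:
        """Postcondition for sorting: the output is ordered and is a
        permutation of the input (same elements, same multiplicities)."""
        ordered = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
        permutation = Counter(out) == Counter(inp)
        return ordered and permutation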
How do you know the specification is correct? It's turtles all the way down here.
Because it takes so much work.
Good software, even with bugs, released today, can be used today. Perfect software, released two years from now, cannot be used today, or next week, or next year. "Good now" beats "perfect later", at least now, but often it beats it later too.
While someone is working on "provably perfect", someone who isn't using that approach has released four versions that were not perfect, but were good enough to be useful. That person took the market, and when the provably perfect version finally comes out, nobody cares, because they're used to how the other one works, and it has more features. And if the "provably perfect" person wants to implement those features, they have to change the program, which means they have to re-do the proof...
Also "perfect software" doesn't actually mean "perfectly robust systems" - at the end of the day it has to interact with the Real World, and the Real World is messy. You still have to cope with failures from cosmic rays, power brownouts, hardware bugs (or at least a difference in specification of the hardware to the model used for this sort of formal verification), failures communicating with other devices etc.
So critical software already has to deal with failures and recover, no amount of formal verification will remove that requirement.
> You still have to cope with failures from cosmic rays
Much like you have to cope with hash or GUID collisions... that is, you don't, because it statistically never happens. Unless you're speedrunning Super Mario or something.
Besides, if you have a program that's formally verified, you just need to do what NASA did for its Apollo missions and make all the logic redundant and gate it behind a consensus algorithm.
You can argue that all 4 computers might get hit by a cosmic ray in just the right place and at just the right time... But it will never ever happen in the history of ever.
So my point is that the real world is messy. But the systems we design as engineers are not necessarily as messy. And the interface between the real world and our systems can be managed, and the proof of it is that we wouldn't be chatting across an entire ocean by modulating light itself if that weren't the case.
I've definitely dealt with hash/GUID collisions in the context of safety critical systems before. It's not a particularly uncommon requirement either.
"just" is pulling a lot of weight in your comment. Redundant consensus is difficult and expensive, all to address very particular error models (like the one you're assuming). If we expand our error model from localized error sources like cosmic rays to say, EMI, there are entire categories of fault injection attacks well-known to work against modern redundant systems.
That's assuming your specification is comprehensive and correct in the first place of course. My experience is that all specifications have holes related to their assumptions about the real world, and many of them have bugs or unintended behavior as well.
And unexpected input from other systems. And if it was unexpected in a way that wasn't covered by your proof, then your program is not guaranteed to work correctly on that input. And the real world has a lot of ways for input to do unexpected things...
But you don't have to do the proving. You can let someone who is, like, REALLY good at formal verification do it, and then benefit from the libraries they produce.
Verification is useful when you're dealing with low-level or very declarative stuff. You don't really have to "verify" a CRUD repository implementation for a random SPA. But it would be good if the primitives it relies on were verified.
Same problem. Sure, it would be great if the primitives we use were proven correct. But I've got stuff to write today, and it doesn't look like the proven-correct primitives are going to arrive any time soon. If I wait for them, I'm going to be waiting a long time, with my software not getting written.
The memory safety of the Rust standard library is an example of something where formal methods are bearing fruit already.
So you don't necessarily have to wait.
This is why connecting formal verification to an infinite supply of (possibly incorrect) labor, such as an LLM system, can indeed save human labor.
I always wonder: how do you prove your proof is correct? Is it turtles all the way down?
Another way to phrase it: how do you prove your formal verification is verifying the correct thing? From one point of view, a proof is a program that can verify the logic of another program. How do we know the one is any more correct than the other?
Because a proof verifier has a fixed complexity, regardless of the proofs it needs to check. Also, a minimal proof verifier is of almost trivial complexity. More advanced and performant verifiers can bootstrap from the minimal one by provably reducing to it.
Software has such huge economies of scale that, for the most part, being useful or cheaper than the alternative outweighs the need to be correct. Only for some categories of software is being correct a priority or a selling point.
> will likely not be a major boost for software productivity
You can make plastic knives much faster and cheaper than metal ones. Production!