I have recently reviewed a journal paper that was describing very clever verification of a very sophisticated concurrent algorithm. I enjoyed the main ideas and the writing, and my only minor concern with respect to the work was the fact that the delivered result was giving very little insight to the potential clients of the algorithm on how to use it: the paper has merely proved that the algorithm is “correct” in some general sense. Therefore, the result was hardly compositional.

In response to this, the author cited Leslie Lamport’s ’97 paper *Composition: A Way to Make Proofs Harder*, which I wasn’t aware of, and which I later found to be quite thought-provoking read, even after eighteen years since it has been published.

For someone working in the field of programming languages theory (like myself) the title seems almost outrageous. Indeed, compositionality is our bread and butter: how can one argue that a way to reason about a program out of facts about its components is not a Good Thing? This is the reason we keep designing more expressive type theories and program logics, right? Turns out that the title is as misleading as it is provocative: the paper is criticising not the idea of compositionality itself (that would be extremely strange, indeed), but of *the idea of using program logics as a way to reason about programs*. At this moment a PL-versed reader should raise her eyebrows again: isn’t the compositionality a main goal of designing such logics, and what could be wrong about them? And, if so it happens that the compositionality provided by the program logics is broken, what is the alternative to using them?

A bit of the context will be useful now, so here is a quote from the paper:

Because computer systems can be built with software rather than girders and rivets, many computer scientists believe these systems should not be modeled with the ordinary mathematics used by engineers and scientists, but with something that looks vaguely like a programming language. We call such a language a pseudo-programming languages (PPL).

The author proceeds by arguing (convincingly) how reasoning in terms of PPL (and the corresponding logics) turns the proofs of simple properties of toy programs into clutter, and then shows how the same reasoning would be conducted in a much more clean way if one instead encoded the implemented algorithm as a traditional mathematical object, such as a state-transition system. This is concluded by the following remark:

Many computer scientists believe that their favorite pseudo-programming language is better than mathematics because it provides wonderful abstractions such as message passing, or synchronous communication, or objects, or some other popular fad. For centuries, bridge builders, rocket scientists, nuclear physicists, and number theorists have used their own abstractions. They have all expressed those abstractions directly in mathematics, and have reasoned “at the semantic level”. Only computer scientists have felt the need to invent new languages for reasoning about the objects they study.

That is, the crux of the paper’s criticism is that program logics as a verification tool are, according to the author, an unnecessary level of indirection, required to conduct proofs at the level of a PPL. In such quality, these logics obscure the proofs of interesting properties one should care about. The conclusion is, therefore, that one should stick to the reasoning directly in terms of an algorithm model instead of reasoning in terms of the program that implements the algorithm.

In my opinion, a valid analogy to such criticism would be to reject probability theory as obscuring reasoning about special cases of measure theory or, perhaps, abandon differential calculus in favor of computing function derivatives directly out of epsilon-delta definitions of limits. The point is, we still *need *more expressive program-tailored abstractions to efficiently prove facts about realistic code people actually write.

On a deeper level, in the ’97 paper, Lamport thinks of programs as of just a way to *run* mathematical objects. Luckily, we have since learned better. Nowadays, the common way of thinking about programs is as of *being mathematical objects* themselves. This view is advantageous because of at least two reasons, which I will elaborate upon below in my attempt to refute Lamport’s genuine criticism (program logics are bogus, so we should all just use plain math).

- Thinking about programs as of mathematical objects, that is, a direct subject of mathematical reasoning, stimulates researchers to find better ways of fixing what’s currently wrong with the existing program logics, while keeping the useful abstractions. This is, for instance, how Separation Logic has been discovered, making proofs about heap-manipulating programs as straightforward and modular as they should be. Another example of pushing the abstractions at the right level is addressing the problem of non-compositional Rely/Guarantee reasoning (indirectly mentioned by the discussed Lamport’s paper, Section 6.2) in the works on RGSep and SAGL. I’m sure that the reader can come up with more examples, well represented in the last two decades of the PL research on compiler correctness, verification of cryptographic schemes and other topics.
- On a more pragmatic side, not giving up on programs in favour of their “vanilla” semantics and developing expressive compositional logics and type theories that go beyond PPLs allows one to eliminate the
*representation gap*between a program and its model. This way one can verify exactly what the programmer has*implemented*, instead of what she*had in mind*. This approach has proven to enable impressive results, such as making ideas from Separation Logic to scale for analysing millions of lines of code in the recently announced Facebook Infer.

It will be fair to notice that the idea of using “purely mathematical” approach to program verification has been efficiently implemented in Lamport’s TLA+ tool, used recently with great success in industry as well. However, as admitted by the clients, what was verified is *specification*, not a *program*, therefore witnessing the “representation gap”.

A less obvious, yet important takeaway from the paper is that sometimes striving for *too much* of compositionality might be harmful as well, and this point I’m not going to argue with. In my experience of verifying concurrent objects, I was once trying to decompose the verification of a well-known readers-writers lock into specifying the “reader” and “writer” part separately. While not impossible, such separation of concerns required quite a lot of subtle dependencies to be defined between the logical modules, and it didn’t pay off at the end, as the two components were always used together. The lesson learned is that one should look for a reasonable “unit of compositionality” when conducting such verification in each particular case. It might be a function, a module, a class or, as in the described case, the whole concurrent library. I believe, some further empirical research in this direction will help to build the right intuition.