## 07 February 2017

### POPL 2017

Some things I learned by attending POPL talks.

Monadic Second Order Logic on Finite Sequences. After two weeks, I do not quite remember what was the main contribution. I do remember that during this presentation I finally understood symbolic automata. As you know, traditional formalisms, such as finite automata, work over finite alphabets. If we want to use such formalisms to talk about programs, we have a problem. Often, we want the letters to be (references to) objects, but the number of objects is unbounded. Other times, we want letters to be integers, but the number of integers is also unbounded. A solution is to take an existing formalism, such as finite automata, and extend it to work over infinite alphabets. One way to do so is to use nominal automata. Roughly, instead of considering just an alphabet $\Sigma$, we also consider a group $G$ that acts on the alphabet, and has a finite number of orbits; for example, we could take all permutations, $G={\rm Sym}(\Sigma)$. Then, nominal automata let you define a language $L$ as long as it is invariant under $G$'s actions: $l_1\ldots l_n \in L$ iff $\pi(l_1)\ldots \pi(l_n) \in L$, for all $\pi \in G$. Register automata offer a concrete way to think about nominal automata. In register automata, you have one extra action – you can store the current letter in a register, and you also have an extra guard – you can test if the current letter equals the letter in a register.

An alternative to nominal automata is given by symbolic automata. Intuitively, nominal automata allow you to state requirements that are simple (they only involve equality), but can talk about several letters in the input word. Symbolic automata do the converse: You are allowed to use an arbitrary logic to express your tests, but you may only refer to the current letter when doing so; in particular, there is no moral equivalent of the memory. When I say ‘arbitrary logic’, I mean that the definition of symbolic automata is modular – you can choose your logic later and you should do so explicitly.

Now back to the main result. As I said, I didn't quite get it (or, possibly, I forgot in the two intervening weeks). But, it was something of the following form. Büchi, Elgot, and Trakhtenbrot (1957,1958) showed that MSO over finite words defines regular languages, by giving a procedure which builds an NFA from an MSO formula. So, if somebody gives you an MSO formula, you can say whether it is satisfiable by building the NFA, and checking if its language is nonempty. Loris D'Antoni and Margus Veanes define a sort of MSO(T) – MSO modulo what you're allowed to require of the current letter – and then explain how to build a symbolic automaton for that. Thus, one can check satisfiability of MSO(T), assuming that satisfiability(?) in T is decidable. (BTW, here's a nice set of slides by Moshe Vardi, which mention the result from 1957: Logic, Automata, Games, and Algorithms.)

My understanding was also helped by a chat I had with Margus prior the talk. I bombarded him with questions and he patiently answered all. I also had a chat with Loris, but mostly about Automata Tutor. I'm happy to report that my son (almost 8 years old now) started this week to solve problems from that website. He thinks it's fun! (But his approach is somewhat too random for my taste … although I occasionally observe a spark in his eyes, followed by what seems to be the execution of a plan.)

LOIS: Syntax and Semantics and Learning Nominal Automata. LOIS stands for looping over infinite sets. It is an extension of C++ that lets you do what the name says. Suppose that someone gives you a finite graph $G$ and asks if it is connected. It is possible to answer this question with a simple program, which checks if each vertex $x \in V(G)$ can reach all vertices $V(G)$ of $G$; that is, ${\it connected}(G) := \bigwedge_{x \in V(G)} \bigl({\it reach}(x) = V(G)\bigr)$, where $\it reach$ is implemented by BFS or DFS. I'm not going to write the code, because it's hopefully clear what I mean. If we try to run the same code on an infinite graph, though, the program won't terminate. In fact, we'd also have a termination problem when trying to construct such a graph, although we could conceivable ‘solve’ it by some laziness. Yet, with LOIS you can use (almost) the same program you use for finite graphs, and it will terminate. For example, you can construct a countably infinite random graph, which contains every finite graph as a subgraph, and ask whether it is connected. How does it work? It does symbolic manipulations. So, not quite any infinite set works: they must be definable (although I didn't quite get in which logic). In particular, sets must be countable. This restriction is what makes it impossible (unfortunately ☺) to solve undecidable problems such as universality of register automata.

You can play with LOIS's implementation.

Finding out whether a random graph is connected seems like an artificial example. Are there any real applications? It depends what you mean by ‘real’. But, I can say that there was at least one (other) POPL paper can be viewed as an application: Learning Nominal Automata. Angluin's algorithm (1987) learns regular languages by using two queries: (1) ‘Is this word in the (secret) language?’ and (2) ‘Does this automaton represent the (secret) language?’ Nowadays, as I mentioned, we care about automata that work over infinite alphabets. It turns out that one can use Angluin's algorithm unchanged (or at least without major modifications – I'm not sure) to learn nominal automata. The authors do not use LOIS, but another language NLambda. It also has infinite sets but, as the name implies, it's functional.

One of the authors of LOIS is Eryk KopczyĆski, whose code on Topcoder I used to read regularly more than ten years ago. Eryk mentioned this puzzle: Assume that P=NP. Input: An NP-complete problem $p$. Output: An algorithm that solves (any instance of) $p$ in polynomial time.

Component-Based Synthesis for Complex APIs. You have access to some existing functions, and you have to implement a function whose signature is given. Often, a straight-line program would do. This paper shows how to automate the task of finding such straight-line programs, by doing a search guided by types. More specifically, the approach is to count. Suppose you want to write a function with the type ${\rm string}\to{\rm int}$, and you have at your disposal two functions: \begin{align*} f &: {\rm string} \to {\rm string} * {\rm string} \\ g &: {\rm string} * {\rm string} * {\rm string} \to {\rm int} \\ \end{align*} You can call $f$ twice and then $g$: \begin{align*} &\{\,{\rm string}\,\} \\ &f \\ &\{\,{\rm string} * {\rm string}\,\} \\ &f \\ &\{\,{\rm string} * {\rm string} * {\rm string}\,\} \\ &g \\ &\{\,{\rm int}\,\} \end{align*} I chose this notation to get across an observation made by Hongseok Yang: the method has some similarities to separation logic. Of course, just counting types has a problem: In the example above, the second call to $f$ has two strings on which we could call it, and just by looking at types there's no way to choose. Similarly, the last call to $g$ needs to fix some order for the $3$ arguments, and types provide no guidance. In such cases, a human would need to intervene. The hope is that types are sufficiently precise to make such situations rare.

The presentation used a formulation in terms of Petri-nets, and the synthesis problem as a reachability question. This is a bit of a red-herring, because it makes it sound like the complexity of synthesis is horrible. However, the synthesis works in a similar way as bounded model-checking: it puts a bound on the length of the program and tries to find a solution; if it fails, it increases the bound and repeats.

Others. I also enjoyed other presentations, but I don't want to make this post too long. (Or, rather, I don't want to spend too long on this post.) These other presentations include: Thread Modularity at Many Levels: a Pearl in Compositional Verification (or, ‘how to avoid auxiliary variables for the benefit of automation’), Polymorphism, subtyping and type inference in MLsub (in which types get rather flexible), Exact Bayesian Inference by Symbolic Disintegration (sounded a bit like abstract nonsense, but was very entertaining), Coupling proofs are probabilistic product programs (from which I learned a card ‘trick’). Also, I was happy to see friends; for example, Hongseok: