June 29, 2012

Plagiarism

There's a political scandal in Romania. Obviously. There's always a political scandal, in any country, at any given time. This one, however, held my attention for more than a minute. So, it must be special.

It turns out that the prime minister is a plagiarist. There is no doubt. Anyone can look at his PhD dissertation and at the book from which much is copied. His party, however, cares more about the source of the accusation (which is likely the president) and fights back by ‘restructuring’ the academic committees that are supposed to deal with such situations.

As a general guideline, in such cases the thesis committee is shamed, but forgiven. Because it's their field, they should have noticed that the text is copied; but, realistically, every person has some blind spots. It could be that they indeed didn't notice the plagiarism. There is reasonable doubt.

As a hard rule, however, the author who plagiarised is both shamed and stripped of the PhD title. There is simply no excuse for copying someone else's work without acknowledgement. It is a textbook example of intellectual dishonesty. One can't copy so much text by mistake; it must have been on purpose. There is no reasonable doubt.

This kind of behavior is a violation of a fundamental principle upon which academic life is based. Specifically, people deserve recognition for the results of their intellectual work. Without a culture that observes such principles, it's very hard to have good researchers. Most good researchers come from the small fraction of people for which recognition is much more valuable than money. Most people, which by definition are not part of this small fraction, tend to simply not understand the previous sentence. Either that or they assume it's dishonest. Such people (most people!) simply can't and won't work on something that is sure not to benefit them financially, and has a very, very small chance of pushing the world forward. Ironically, such work underlies much of the cool stuff that surrounds us.

If you take away recognition, by creating a culture in which intellectual theft is OK, you take away a very powerful motivator. Those who would be good researchers choose other careers instead. You create an environment in which almost no new knowledge will appear. This is the problem. The petty political quarrel between the prime minister and the president is nothing, absolutely nothing, by comparison.

I want to live in a world in which knowledge is respected. I want my child to live in a world that knows more than today. I am grateful that, as a result of new knowledge, I enjoy things that my grandparents didn't even dream when they were children: TV, movies with incredibly realistic special effects, computers, printers, LCD displays, smartphones, affordable and easy to apply paint in bright colors, internet, credit cards, cash machines, non-stick frying pans, contact lenses, bar codes for fast checkout in a store, roller blades, video games, RADAR, digital photo-cameras, and so on.

Take a stand. Don't let politics destroy the chance for a good academic life in Romania.

April 5, 2012

Fixed-Point Magic: Successive Recursive Calls

A brief summary of the Y-combinator trick, and an unusual use.

You may be familiar with the Y combinator and its sexy uses in functional programming. The standard examples of use are logging and caching. Let's summarize how the trick works. Given a recursive function $f:a\to b$, you rewrite it into a non-recursive function $f':(a\to b)\to(a\to b)$: The body of $f'\;f$ (that is, $f'$ applied to its first argument which we name $f$) looks exactly the same as the body of $f$. At this point, types are already long, so I'll write $t^\circ$ for $t\to t$. For example the type of $f'$ is $(a\to b)^\circ$. The Y combinator has the type $\forall\alpha\beta,\,(\alpha\to\beta)^\circ\to(\alpha\to\beta)$. In words, Y takes a function like $f'$ and ‘ties the knot’ to give us back $f$. Before tying the knot, however, you may apply to $f'$ any number of transformers, which are functions of type $(a\to b)^{\circ\circ}$. Roughly, each transformer pre-processes the argument and post-processes the result.

Here is an example.

let rec y f x = f (y f) x
let fib' fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)
let p_int x = printf "%d\n" x
let log p f' f n = p n; f' f n
let fib = y (log p_int fib')

The call fib 3 prints the following:

3
1
2
0
1

But suppose you want to print the call graph.

init -> 3
3 -> 1
3 -> 2
2 -> 0
2 -> 1

To do so, a transformer needs access to the current value of the argument, but also to the previous value. We will use a combinator successive of type $$ \forall\alpha\beta\gamma,\;(\alpha\to\gamma)\to(\alpha\to\beta)^\circ\to(\gamma\times\alpha\to\beta)^\circ $$ It transforms an untied function of type $(a\to b)^\circ$ into an untied function of type $(c\times a\to b)^\circ$, using a helper of type $a\to c$.

let successive g f' f (x, y) = f' (fun z -> f (g y, z)) y
let p_string_int (s, n) = printf "%s -> %d\n" s n
let fib = y (log p_string_int (successive string_of_int fib'))
let _ = fib ("init", 3)

Let's break down the types involved in the second to last line, because they can get long and may be hard to track in one's head.

fib' : (int -> int) -> (int -> int)
string_of_int : int -> string
successive :
  (int -> string)
  -> ((int -> int) -> (int -> int))
  -> (string * int -> int) -> (string * int -> int)
successive string_of_int fib' :
  (string * int -> int) -> (string * int -> int)
p_string_int : string * int -> unit
log :
  (string * int -> unit)
  -> ((string * int -> int) -> (string * int -> int))
  -> (string * int -> int) -> (string * int -> int)
log p_string_int (successive string_of_int fib') :
  (string * int -> int) -> (string * int -> int)
y :
  ((string * int -> int) -> (string * int -> int))
  -> (string * int -> int)
fib : string * int -> int

January 4, 2012

Seven Books that Everyone Should Read

If you want a solid background in the theoretical parts of computing, then you should read the following books. The best time to do so is in high-school and as an undergraduate. But never is too late.

Types and Programming Languages. You will enjoy this book if you like to manipulate symbols. Once you understand ordered sets and induction, you are ready to begin reading. You will learn at least the following:

  • Lambda calculus is the computational model of functional languages.
  • Well-typed programs don't fault (except when they do).
  • You construct more complicated types, such as variants, records, and function types, out of simpler ones.
  • Types are much like propositions in logic, and polymorphic types are much like propositions in second-order logic.
  • Mutable state is tricky.

Algorithms. You will enjoy this book if you like the problems used in Google Code Jam. Once you understand the notation $O(f(n))$ and you can solve the recurrence $x_{n+1}=2x_n+1$, you are ready to begin reading. You will learn at least the following:

  • Reductions are the algorithm design technique.
  • When you reduce an instance of a problem to smaller instances of the same problem, you say that you use recursion (or, if you want a fancier phrase, say ‘divide and conquer’).
  • If you cache the results of recursive calls, then you say that you use memoization.
  • If you use at most one recursive call, then you are greedy.
  • If you are stuck, then try to reduce your problem to linear programming, or some other problem that is famous and has good implementations.

Purely Functional Data Structures. You will enjoy this book if you puzzled over how to implement breadth-first search in a programming language without mutation. Once you know the basics of a functional language (such as OCaml or Haskell), you are ready to begin reading. You will learn at least the following:

  • Persistent implementations of heaps and of search trees are elegant.
  • Amortization is a technique for data structure design, although it is usually presented as a technique for algorithm analysis.
  • The requirements of persistence and amortization imply the need for lazy evaluation.
  • Number representations correspond to collection implementations.
  • Polymorphic recursion is useful.

Information Theory, Inference, and Learning Algorithms. You will enjoy this book if you like probabilities (or gambling). Once you can answer ‘What is the probability that the sum of a pair of dice is $\ge3$?’, you are ready to begin reading. You will learn at least the following:

  • Information equals entropy; it is measured in bits.
  • Communication systems are reliable because they use codes, which rely on redundancy.
  • Out of functions whose spectrum has finite support, those of type $\mathbb N\to\mathbb R$ carry as much information as those of type $\mathbb R\to\mathbb R$.
  • The bandwidth and the signal-to-noise ratio of a channel determine a fundamental upper limit on its capacity, known as the Shannon limit.
  • The codes that best approach the Shannon limit are based on sparse graphs.

The Art of Computer Programming. You will enjoy this book if you like detailed precision. Once you wrote a program in a low-level language, you are ready to begin reading. You will learn at least the following:

  • You must prove that your programs are correct and fast.
  • There are ${2n \choose n}/(n+1)$ binary trees with $n$ nodes; that is, $\sim 4^n/n$.
  • One of the best hashes of $x$ takes the high bits of $\lfloor 2^wa\rfloor x$, where $w$ is the word width, and $a$ is some irrational number from $(0,1)$.
  • BDDs represent boolean functions; ZDDs represent families of sets.
  • You can generate the permutations of $n$ objects using only a constant number of operations for each step.

Approximation Algorithms. You will enjoy this book if you like good-enough, fast code more than you like perfect, but nonexistent code. Once you finished the Algorithms book from above, you are ready to begin reading. You will learn at least the following:

  • You need not despair if you have to solve an NP-hard optimization problem.
  • Greedy algorithms are surprisingly good, but you should know examples on which they behave worst.
  • The approximation ratio compares the solutions produced by an algorithm to optimal solutions.
  • The greedy algorithm for Set-Cover has approximation ratio $O(\lg n)$, where $n$ is the size of the universe.
  • To design a minimization algorithm, search for a certificate that provides a lower bound.

Computational Complexity. You will enjoy this book if you like to know what computers can do in principle more than you like to know what computers really do. Once you understand why the halting of a program cannot be decided, you are ready to begin reading. You will learn at least the following:

  • ‘Can we solve a problem efficiently if we can check its answers efficiently?’ is the most famous open question in theoretical computer science.
  • We can efficiently find models for 3CNF formulas only if the answer to the famous question is ‘yes’.
  • Every problem that can be solved with reasonable amounts of memory can be solved (not much slower) by playing a game.
  • Some proofs don't teach you anything.
  • If you can quickly count models of CNF formulas, then you are good at playing games.

Disclaimer. I read only parts of these books.

Other lists. I wrote this post after I saw two other book lists: one for programmers (on a Romanian site popular among high-school students) and one for theoretical computer scientists (on CSTheory).


What other books would you put on this list? Which ones would you take off this list?

November 16, 2011

Squaregraphs

On Tuesday, Donald Knuth gave a talk about squaregraphs. I'm summarizing here what I learned, and I'm including the open problems that Knuth mentioned.

Squaregraphs are plannar graphs. When you draw a graph in a plane, each of its cycles partition the plane into two regions, one bounded and one unbounded. When the bounded region is empty (that is, no edge and no vertex is drawn in it), the cycle is called a bounded face; when the unbounded region is empty, the cycle is called an unbounded face. By definition, all bounded faces of a squaregraph are 4-cycles. There is one more condition. A vertex is interior if it does not belong to the unbounded face. All interior vertices must have degree $\ge4$. Here is an example, a gear graph.

Knuth told us about many properties that are implied by this definition, usually without proof. I enjoy such talks. First, they give you a hawk's eye view of a subject; second, they make you think about the proofs. Some proofs are easy and some escape you during the talk and even days afterwards, but at least you get an idea of what is easy and what is not. What bothers me is that at no point did I feel the need to invoke the last property, that interior vertices have at least 4 neighbours.

But let's focus on what I did learn, rather than what I did not understand.

While on the bus, before the talk, Rasmus and I looked at the Wikipedia article on squaregraphs, which is very dense, and tried to make sense of it. I think it helped both of us to follow the presentation better. One of the last things Rasmus said was: "Squaregraphs ought to have an inductive definition." And, lo and behold, we found out what that definition is during the talk. Every squaregraph can be built by starting with a 4-cycle and repeatedly adding new faces, subject to the degree constraint: take a simple path of length 1, 2, or 3 that is part of the unbounded face and make a new 4-cycle by adding, respectively, 3, 2, or 1 new edges (drawn in the former unbounded face, that is, towards the outside).

Open problem: Is it possible, using this inductive definition, to construct squaregraphs that look like Mona Lisa when viewed from afar?

By the way, a frog can jump from an edge to an opposite edge across a bounded face, obviously. Here is a picture with all possible frog expeditions on an 8-gear.

Each expedition, depicted above with a curved green line, is a set of edges that cut the graph into two disconnected parts. We give vertices on one side a bit $0$ and vertices on the other side a bit $1$. If we do this for each expedition then each vertex gets as many bits as there are expeditions. To put some order on these bits we name the expeditions $0$, $1$, $\ldots\,$, $n-1$, so each vertex gets an $n$-bit string. In our example, there are 8 green lines so each vertex will get an 8-bit string.

These bitstrings are interesting because the shortest distance between two vertices equals the Hamming distance between the bitstrings of the vertices. We say that we found an isometric embedding of squaregraphs into an $n$-dimensional hypercube. In $n$ dimensions, squaregraphs look squarish.

Another interesting property of squaregraphs is that they are median graphs. A median graph is one in which all three vertices $x$, $y$, and $z$ have a median vertex $\langle xyz\rangle$ that is on some shortest path between $x$ and $y$, on some shortest path between $y$ and $z$, and on some shortest path between $z$ and $x$. An example is the integer grid, with vertices $(m_x,m_y)$ for all integers $m_x$ and $m_y$, and edges between vertices $m$ and $n$ when $|m_x-n_x|+|m_y-n_y|=1$. The $x$-coordinate of $\langle mnp\rangle$ is the median of the array $[m_x,n_x,p_x]$, hence the name (I guess). It turns out that you can also isometrically embed median graphs in $n$-dimensional hypercubes. That is, you can label vertices with bitstrings such that travelling an edge flips exactly one bit.

A subset of the $n$-dimensional hypercube is a median graph exactly when the selected coordinates are the models of a 2-CNF formula. Knuth said that an open problem would be to find a nice condition to impose on 2-CNF formulas such that their models correspond to squaregraphs.

Notice that on squaregraphs we have a notion of a line (the green equivalence classes above) and also of turning left or right. This means that we can define a knight's move on the bounded faces of a squaregraph. How can we generalize various problems that are usually done on chessboards to squaregraphs? For example, can one define something like rook polynomials for squaregraphs?

Finally, is there a 3D generalization of squaregraphs?

For more information, Knuth recommended a paper by Bandelt, Chepoi, and Eppstein, which is also recommended by Wikipedia. It's quite long, though. Knuth also showed us some nice pictures of squaregraphs on Eppstein's blog. Of this blog he said: "It is one of the best blogs in computer science. I do not read it, for otherwise I would not do anything else."

[Thanks to Rasmus for finding several mistakes in a previous version.]

November 10, 2011

Automata on Infinite Alphabets

There are several ways of extending the standard definitions of automata to infinite alphabets. Here I try to explain why the trivial extension (just use the same definitions) is not satisfactory and present a couple of alternatives.

The usual definitions go like this. Given is a finite set $\Sigma$. We say that $\Sigma$ is an alphabet and its elements are letters. Words are strings of letters, languages are sets of words. A finite automaton is a finite digraph whose arcs are labelled by letters, has a distinguished initial vertex, and has some vertices marked as being accepting. Each (finite) automaton determines a language, namely the set of those words obtained by concatenating the labels of a walk from the initial vertex to some accepting vertex. A language that can be described by a finite automaton is said to be regular.

Why restrict $\Sigma$ to be finite? What could possibly cause problems if we allow it to be bigger? Isn't the generalization straightforward?

Yes and no.

Yes, we could simply remove the restriction on $\Sigma$ and keep the same definitions. However, that would lead to situations that intuitively just don't feel right. For example, is the language $L=\{\,abc \mid c\in\{a,b\}\,\}$ regular? With the definitions above $L$ is regular exactly when $\Sigma$ is finite. There is something unpleasant about a language suddenly becoming non-regular, although its (declarative) definition does not change.

Recall the standard model for (finite) automata: a transition function $\delta:X\times\Sigma\to X$, where $X=V$ is the (finite) set of vertices, an accepting function $\alpha:V\to2$, and the initial function $\iota:1\to X$. There are at least three ways of extending the standard model: finite-memory automata aka register automata (Francez and Kaminski, 1994), pebble automata, and data automata. All three are surveyed by Segoufin (2006). Thinking back of the language $L$, the intuitive reason why it cannot be recognized by a finite automaton is that we cannot easily remember the letters we saw. So the natural thing to try is to add a (finite) memory to the automaton. This suggests distinguishing between the (not necessarily finite) set of states $X$ and the finite set of vertices $V$. For example, we could take $X=V\times\Sigma^k$ for some $k$. The automaton has $k$ registers. However, now the automaton is too powerful. For example, take $V$ to be the singleton set $1$, and $\Sigma$ to be the set of natural numbers $\mathbb{N}$, and $k$ to be $1$. Then $\delta$ could be essentially any function $\mathbb{N}\times\mathbb{N}\to\mathbb{N}$. We did not even require it to be computable! The class of languages recognized by such a device is clearly too big to be called "regular". This is why extra restrictions on $\delta$ are needed. Francez and Kaminski (1994) essentially required that all that $\delta$ can do with letters is (1) store them in registers and (2) compare them for equality.

Finally, Bojanczyk et al (2011) give what I think is a very cute extension of the standard model of (finite) automata.


Their model is still a triple $(\delta, \alpha, \iota)$ of functions with the types as given above. However, the extra restrictions are much more principled. Now $X$ and $\Sigma$ are required to be $G$-Sets (for some group $G$), and all the three functions are required to be invariant. (These concepts are explained in my previous post on group actions.) A language is now $G$-regular when it is recognized by an automaton for which both $X$ and $\Sigma$ have a finite number of orbits.