16 August 2008

Language design

This post gives a subjective and high-level view of how languages, tools, and programmers interact.

We start with a small set of primitives that has pleasant aesthetics and provable power. Beauty implies symmetry, regularity, lack of special cases, generality. Power implies that a wide set of problems can be solved (efficiently) by combining primitives. Think of a nice assembly language with 10 instructions or so, or perhaps even something like MMIX.

There is no guarantee that solutions will be easy to find or that they will be small and elegant: We only know they exist. An elegant language does not automatically yield elegant programs. Nevertheless, this is the first step. We then solve problems we care about and stare at the ugly solutions we produce. No one solution is particularly ugly, but as a set they are. We build libraries that encapsulate recurring patterns in the ugly solutions and rewrite those solutions into something less ugly. Adding a layer of abstraction helps. Some people say: "Let's step back. If abstraction helps then let's see what can be abstracted. How can we eliminate all redundancy? Can we abstract the process of abstracting?" I don't like this. I don't like one language to rule them all. I like variety. I like a small amount of redundancy. I like to solve existing problems, not potential future problems. The initial set of primitives addressed a problem domain: We took time to prove sufficiency. Libraries reduced the redundancy that became apparent after writing solutions to the set of problems we were most interested in.

It's nice to have beautiful programs, but we should also think about how we get there.

The set of primitives is nice if almost all possible combinations are potentially useful. Alas, it appears that this property is hard to maintain while going up the abstraction ladder: There are often combinations that do not make sense. If integers and integer division are primitives then x/y makes sense almost always, but there is no sensible definition for x/0. The problem gets worse for intrinsically longer combinations, such as those that use library abstractions. We are bound to combine constructs in nonsensical ways a few times while developing a program, more often than not because our skull has limited volume. We can't pay attention to many details at the same time. It sounds like a job for a computer!
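To make the x/0 point concrete, here is a minimal Python sketch. The helper name count_undefined_quotients is made up for illustration. It tries integer division on every pair drawn from a small range and counts the pairs for which the language has no answer to give.

```python
def count_undefined_quotients(values):
    """Try x // y for every pair and count the pairs with no sensible result.

    Hypothetical helper, written only to illustrate the point in the text.
    """
    values = list(values)
    undefined = 0
    total = 0
    for x in values:
        for y in values:
            total += 1
            try:
                x // y
            except ZeroDivisionError:
                # The combination is syntactically fine but has no meaning.
                undefined += 1
    return undefined, total


undefined, total = count_undefined_quotients(range(-3, 4))
print(f"{undefined} of {total} combinations are undefined")
```

With only two primitives in play, the bad combinations are a small, easily described minority. Nothing here says how that fraction grows for longer combinations; that remains the post's conjecture.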

Tools can help essentially in three ways:

  1. A tool can try to find problem spots by looking at the program text. Type checkers do this. Some problems are really hard to spot this way, even if extra information is given by the programmer as annotations.
  2. A tool can introduce sanity checks in the code. That's how NullPointerException gets thrown in Java. (By the way, if you happen to believe that Java has no pointers, then can you please explain that name?) For both static analysis and runtime checks, combinations that are legal according to the language may still be meaningless for the programmer. Sure, you can do x/2 any time, but sometimes you expect to have no remainder when you do it. You can either add a 'no remainder division' to the language and use it, or keep saying x/2 and precede it by the annotation assert even x.
  3. The programmer introduces sanity checks and gets support for doing so, in the form of libraries, or code navigation utilities that point to good places to insert sanity checks. People usually call this testing.
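The two options for x/2 mentioned in the second item can be sketched in Python. Both function names, exact_div and half, are hypothetical and exist only for this illustration. The first plays the role of a 'no remainder division' operation, the second keeps ordinary division and precedes it by an assertion.

```python
def exact_div(x, d):
    """Hypothetical 'no remainder division': succeeds only if d divides x."""
    q, r = divmod(x, d)  # raises ZeroDivisionError when d == 0
    if r != 0:
        raise ValueError(f"{x} is not divisible by {d}")
    return q


def half(x):
    """Ordinary division, preceded by the annotation 'assert even x'."""
    assert x % 2 == 0, "expected an even number"
    return x // 2


print(exact_div(10, 2))  # option 1: the check lives in the operation
print(half(10))          # option 2: the check is an annotation at the use site
```

One difference worth noting: Python drops assert statements when run with the -O flag, so the second form is a check that can be switched off, while the first is always part of the operation. A test in the sense of the third item would simply call these functions with inputs chosen by the programmer, including odd ones.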
