03 December 2010

More OCaml Pretty-Printing

How to print expressions with binary operators using as few parentheses as possible. In this post, we avoid parentheses that are rendered unnecessary by precedence rules.

Consider the following type.

type op = Plus | Times
type expr = Op of op * expr list | I of int

And here is a value:

let v = Op (Plus, [I 3; Op (Times, [I 5; I 7; Op (Plus, [I 4; I 2])]); I 2])

If we don't care about how many parentheses are printed, we'd write

open Format

let op2s = function Plus -> '+' | Times -> '*'

let rec pp_list sep pp ppf = function
  | [] -> ()
  | [x] -> fprintf ppf "%a" pp x
  | x::xs -> fprintf ppf "%a%c%a" pp x sep (pp_list sep pp) xs

let rec pp ppf = function
  | Op (o, es) -> fprintf ppf "@[(%a)@]@," (pp_list (op2s o) pp) es
  | I x -> fprintf ppf "%d@," x

let _ = pp std_formatter v

The rule for precedence is simple: It's safe to strip parentheses in $(A \circ B) \bullet C$ if and only if $\circ$ has higher precedence than $\bullet$. It would be nice to isolate the code that decides whether parentheses are needed. We begin by removing grouping concerns from pp.

let pp r ppf = function
  | Op (o, es) -> pp_list (op2s o) r ppf es
  | I x -> fprintf ppf "%d" x

We can recover parentheses everywhere with a simple Y-like combinator.

let rec y1 f ppf x = fprintf ppf "@[(%a)@]@," (f (y1 f)) x
let pp1 = y1 pp
let _ = pp1 std_formatter v

Now pp1 does (almost) the same as the initial pp, the big difference being that we isolated the code that handles grouping. Because of this, implementing a policy based on operator precedences is simply a matter of defining another combinator.

let precedence = function 
  | Op (Plus, _::_::_) -> 1
  | Op (Times, _::_::_) -> 2
  | _ -> 0

let rec y2 up f ppf x =
  let down = precedence x in
  let lp, rp = 
    if down <= up && up <> 0 && down <> 0 then ("(", ")") else ("","") in
  fprintf ppf "@[%s%a%s@]@," lp (f (y2 down f)) x rp

let pp2 = y2 0 pp

let _ = pp2 std_formatter v


rgrig said...

I finally found the paper from which I first learned similar tricks: It was McAdam, Y in Practical Programs, 2001.

Rasmus said...

You know I would prefer to use the ordinary fixed point combinator:

let rec y f x = f (y f) x

and rename pp to pp' to indicate it is untied. I would also rename the argument r, so it is more clearly the recursive call, like so:

let pp' pp ppf = function
| Op (o, es) -> pp_list (op2s o) pp ppf es
| I x -> fprintf ppf "%d@," x

Now, we can implement the modification as a transformer. With the given definition of precedence, you can define

let paren f f_par ppf (x, up) =
let down = precedence x in
let lp, rp =
if down <= up && up <> 0 && down <> 0 then ("(", ")") else ("","") in
let f' ppf' x' = f_par ppf' (x', down) in
fprintf ppf "@[%s%a%s@]@," lp (f f') x rp

and then when you tie the fixed point, you get a function which expect two parameters:

let pp_par = y (paren pp')

let pp ppf x = pp_par ppf (x, 0)

All very nice and reusable :-)

rgrig said...

Thanks, I agree your version is nicer: It is only slightly longer, but it makes it possible to stack 'transformers' like paren.

Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.