Theory and Practice: December 2008

17 December 2008

Array puzzle solution

Labels: puzzle

The problem was solved by Hadi Moshayedi.

The hard nut is, of course, initializing in constant time an array of size n. We can try to keep track of a set of indexes that are initialized. This reduces the problem to implementing a set of integers that supports make_empty, insert, and contains operations in constant time.

array_init: allocate an array of size n, make an empty set
array_set: set the element in the array and insert the index in the set of indexes
array_get: if the index is not in the set return the default value, otherwise read from the array

It remains now to see how to implement such a set. A hash-table is not good because it is randomized (and many implementations offer only amortized constant time). A bitmap does insert and contains in constant time, but we can't make an empty set quickly. A list of integers can do insert and make_empty in constant time, but we can't determine quickly if an integer is in the list. Can we get the best of both? Yes, with two tricks:

If we represent the list as an array plus a counter then we can index it quickly. So we might be able to do the contains operation in constant time if we know where to look.
In a bitmap we don't have to store only bits: We can store the information we need for the previous observation. And here's the crux: If we do so we do not need to initialize it!

I'll leave it here. If you still have trouble writing the code, let me know.

PS: Rustan said he vaguely remembers reading this in Knuth, but he wasn't sure.

Edit: Cosmin pointed me to a more clear explanation. (I didn't read it: It's a bit long.)

12 December 2008

Java and generics

Labels: coding

This is a question about how to use Java.

Let's say you abstract the notion of a tree in an interface.

interface Tree {

  public Tree getLeft();

  public Tree getRight();

}

Now you can implement various types of trees and put the common operations in only one place. But, the interface doesn't quite capture what a tree is, because you don't want mixed trees.

interface Tree<T extends Tree> {

  public T getLeft();

  public T getRight();

}

Now you can guarantee statically that the children have the same type as the parent:

final class IntTree implements Tree<IntTree> {

  private IntTree L, R;

  private int data;

  public IntTree(IntTree L, int data, IntTree R) {

    this.L = L; 

    this.data = data;

    this.R = R;

  }

  @Override public IntTree getLeft() { return L; }

  @Override public IntTree getRight() { return R; }

  public int getData() { return data; }

}

A utility class that works only for homogeneous trees looks like this:

class Trees<T extends Tree<T>> {

  public int countNodes(T t) {

    if (t == null) return 0;

    return 1 + countNodes(t.getLeft()) + countNodes(t.getRight());

  }

}

Question: What do I do if I want to have a generic operation that does something special for one type of tree? Usually one would inspect the runtime type, but the following doesn't work.

class TreeProcessor<T extends Tree<T>> {

  public int countOrSum(T t) {

    if (t == null) return 0;

    int result = countOrSum(t.getLeft()) + countOrSum(t.getRight());

    result += t instanceof IntTree? ((IntTree)t).getData() : 1;

    return result;

  }

PS: In my case, the special thing I wanted to do is run some debugging code that computes and prints some information. For "normal operation" there's no need to do anything special, because the operations really are "generic".

08 December 2008

Array puzzle

Labels: puzzle

I know this problem from Rustan Leino:

Assume that malloc(n) takes O(1) time and memset and calloc take O(n) time. Fill in the missing parts below:

struct array {
  //... data
};

// Creates an array that containes |size| copies of |default_value|.
struct array* array_init(int size, int default_value) {
  // ... code
}

// a[index] = new_value
void array_set(struct array* a, int index, int new_value) {
  // ... code
}

// return a[index].
int array_get(struct array* a, int index) {
  // ... code
}

All methods should return in constant time in the worst case, deterministically.

Send your solutions to radugrigore at gmail.

07 December 2008

Problem from a program verifier

Labels: puzzle, research

Problem. You are given the positive weights w₁,…, w_n and a limit W. Partition them using as few subsets of weight ≤ W as possible.

Question. I know a solution that takes Θ(3ⁿ) time. Can you do better? Can you find a lower bound?

01 December 2008

Strongly connected components, idiot

Labels: coding

I'm an idiot. The "proof" I gave for my SCC "algorithm" is wrong because DFS does not recursively explore all reachable unseen nodes: only unseen nodes reachable only thru unseen nodes. Sometimes I amaze even myself with the kind of mistakes I can make. I should have gotten used to it by now, right? Remember: Whenever you have a result that is too good to be true then it is too good to be true; you are better off trying to break it than convincing yourself that it works. Great! Now I'm talking to myself.

edit: A counter-example is scc({1:[2,3], 2:[1], 3:[]}). There should be two SCCs, not one as my code says. And, yes, I will continue with part 2 in which I'll describe Tarjan's algorithm. Hopefully his is better than mine :P.