21 September 2006

Google Codejam 2006 Round 2

I managed a very bad performance in the latest round of GCJ, ending up on place 286. Even if I would have been in good shape I very much doubt that I would have been able to finish in the first 100 and qualify to the onsite. Long story short: the problem were nice so I want to share them and their solutions.

Easy

This problem was tricky, lots of people missed something. It said that you are given a string and you should decide whether it is a valid fraction of the form "/" such that (1) integers do not have leading zeros, (2) integers are not greater than 231, (3) integers are non-negative (i.e., no `minus' sign), (4) the denominator is not zero, and (5) no extra characters (such as spaces) are allowed. For example "3/4" is valid, while " 0/0" is guilty on two counts (has space and the denominator is zero). You should return the index of the first character that makes the input bad, or -1 if the input is valid.

During the match I implemented an automaton, very similar to what is used to detect a regular expression.

Medium

The medium problem gives an integer with at most 50 digits (as a string with no leading zeros) and asks for the minimum number of digits that need to be changed (digits to the left can be changed into zeros) such that the resulting number is a multiple of m, which is at least 1 and at most 50000.

This problem can be solved by first writing in mathematical language what is being asked: we want to know the minimum number of digits to change in x such that x=0 (modulo m). For problems that involve multiplication and very big numbers it pays of to look at the digits one by one. So we write x=10x'+d, where d is the last digit of x. The condition then becomes 10x'=-d (modulo m). This suggests one solution. First we generalize the problem to finding how many digits should be changed in x such that x=a (modulo m). Then we can try all possibilities for d (keeping track if that means a changed digit or not) and check how many digits should be changed in x' such that x'=(a-d)/10 (modulo m), which is the same problem. All that is left is figuring out the base case and computing division by 10 modulo m. Both are easy tasks.

Here is the relevant part of my code.

  int getMin(string t, int m) {
    int n = t.size();
    int i, j, k, d;

    vector<vector<int> > inv10(m);
    for (j = 0; j < m; ++j)
      inv10[(10 * j) % m].push_back(j);

    for (i = 0; i < n; ++i) for (j = 0; j < m; ++j)
      c[i][j] = INF;

    int msd = t[0] - '0';
    for (j = m - 1; j >= 0; --j) {
      for (d = 9; d >= 0; --d) {
        if (d % m == j)
          c[n-1][j] = min(c[n-1][j], int(d != msd)) ;
      }
    }

    for (i = n - 2; i >= 0; --i) {
      int dx = t[n-i-1] - '0';
      for (j = m - 1; j >= 0; --j) {
        for (d = 9; d >= 0; --d) {
          int nm = ((j - d) % m + m) % m;
          for (k = 0; k < inv10[nm].size(); ++k)
            c[i][j] = min(c[i][j], (d!=dx) + c[i+1][inv10[nm][k]]);
        }
      }
    }

    return c[0][0];
  }

Hard

This problem gives a rooted tree with weighted edges and a budget. It askes how many nodes can be connected to the root using that budget. The number of nodes is at most 100, while the budget is (an integer) at most 1000000000.

Since we are given a tree we have a natural way of breaking the problem into subproblems. Let's consider some arbitrary node n with children n1, ..., nq, the costs toward the children being respectively c1, ..., cq. Let us now denote the number of nodes in tree t that can be connected with a budget of b by N(t, b). How can we construct this number from N(ti, b')? It turns out we can't because there are just too many ways to split the money between the children. So, instead, we'll ask them how much money they need to connect a certain number of nodes. There are only 100 nodes so it should be easier. Let's then denote by B(t, n) the budget necessary to cover n nodes in the tree rooted at t.

We can now compute B(t, n) = minn1+...+nq=n sumi B(ti, ni) + [ni != 0] ci. This can be done efficiently by a simple dynamic programming, basically updating B(t, *) incrementaly after considering each child. The only remaining problem is to make sure we process the children first and then their parrents -- a classic job for DFS.

Here is the code. It contains some parsing necessary because the graph was given in a text. Just think of adj as being adjacency lists and the corresponding positions in cst represent costs. Also, m is the number of nodes, r is the index of the root, and bm is the available (maximal) budget.

typedef long long i64;
int m;
vector<int> adj[128];
vector<int> cst[128];

i64 b[128][128];

const i64 INF = 0x7fffffffLL * 100;


void dfs(int nn, int n) {
  int i, j, k;
  for (i = 0; i < adj[n].size(); ++i) if (adj[n][i] != nn) {
    dfs(n, adj[n][i]);

    for (j = m; j >= 0; --j) {
      for (k = 0; k <= j; ++k) 
        b[n][j] = min(b[n][j], b[n][j-k] + b[adj[n][i]][k] + !!k * cst[n][i]);
    }
  }
}

struct HighwayBuilding {
  int getMaxCities(int m_, int r, vector <string> gs, int bm) {
    m = m_;
    int i, j;
    for (i = 0; i < gs.size(); ++i) {
      replace(ALL(gs[i]), ',', ' ');
      istringstream iss(gs[i]);
      int f, t, c;
      while (iss >> f >> t >> c) {
        adj[f].push_back(t);
        cst[f].push_back(c);
        adj[t].push_back(f);
        cst[t].push_back(c);
      }
    }

    for (i = 0; i < m; ++i)
      for (j = 2; j <= m; ++j) b[i][j] = INF;
    
    dfs(-1, r);

    for (i = 1; i < m && b[r][i+1] <= bm; ++i);
    return i - 1;
  }
};

17 September 2006

Tutorials

This semester I will organize the tutorials for the Algorithmic Problem Solving course at UCD. It is a first year course. Here is a problem that a student might see on an exam.

A set of tumbles are laid on a table, some of them with the right side up and some upside-down. You are only allowed to reverse two at a time. What conditions should the initial configuration satisfy so that it is possible to get them all with the right side up?

I would like to explore some possibilities for teaching such material. But before I delve into that I must say that I'm not familiar with UCD customs or with how this course was taught in the past. So if I commit some kind of sacrilege be sure it's because of my lack of information, not because I am the revolutionary type. I believe in slow but steady change.

So, what is a tutorial? Here is what I found in the Collaborative International Dictionary of English.

A class or short series of classes in which one or more instructors provide intensive instruction on some subject to a small group. Such short courses of instruction may be held at an institution of learning, or in any other place where a small group may desire a brief but thorough introduction to a topic.

The first problem: "is it better to have one or more tutors in class?"; and a related one: "what is the optimum number of people per class?". Where I studied there was almost invariably only one tutor (or "teaching assistant" as it is called by Americans) that worked with a group of about twenty students. Here, at UCD, I've heard of at least one course where the tutorial sessions have more than one tutor present. This is an important difference since going from one to many tutors makes it impossible to have a homogeneous session. If there is only one tutor he or she can give coherence to the whole session by having a schedule and acting at some points as a lecturer and at other points as a moderator for a discussion in which the whole class participates. This is the style of tutorials I am used to. On the contrary, if there are many tutors present then the session tends to favour individual work by each student (or by small teams of students). The tutors simply act as sources of information that wonder around the room.

As you can probably guess I am a big fan of the one-tutor per session scheme. What do you think about it?

The second problem is: "for how long should the students be allowed to drive the session?". One extreme is to just act as a lecturer: from the beginning to the end the tutor shows the students how problems are `supposed' to be solved and what is the necessary theoretical knowledge. A variation is to ask a student from time to time to come to the board and solve a problem. The other extreme is to throw a problem at the class and let them talk it dead until they solve it.

The middle road that I prefer is to start by presenting some theory, pose a problem, act as a moderator for a class discussion while they are trying to solve it, and then, once the solution is found, formalize it on the board. Would you prefer the balance to be different? Why?

The last point that concerns me is: "how much time should the tutor spend deviating from the focus of the course?". The political correct answer would probably be "as little as possible". But, since I get the opportunity to talk to first year students, I feel compelled to spend a little amount of time touching on high-level issues concerning our profession. I would like to talk to them about ACM, I would like to have them read Dijkstra's humble programmer essay, some essays about what prospective employers are looking for, and in general about what the role of the college is. That's just too much stuff, though, that would clutter the technical material; and that should be the fun part; except it isn't, at least for first year students. Another way of deviating is by doing more than the course tries to teach the students.

So, should a tutor deviate by talking about high-level issues or by really pushing the students towards their limits? Will that scare them away?

13 September 2006

Blog readlist for September

A bit late but here it is