Theory and Practice: ICFPC 2005

The task was to construct a robber-bot and a cop-bot to play a game with given rules. You are given a (fixed) digraph on which a robber and five cops play alternatively: the robber makes a move, then each cop makes a move. The game ends when the cops catch the robber or after 200 moves (counting robber moves and cop team moves). The graph has 8 special nodes: 6 banks, robber start, and police headquarters. Each bank starts with 1000 (money). Eight moves after a robbery takes place each of the other banks transfer one sixth of their money to the robbed bank. The police headquarters is the start position for all cops and also the place where they can switch from going on foot to using a car. (The robber does not have a car at all.) The digraph has two types of edges: foot and car. Foot edges can be traveled in any direction by foot and only in their designated direction by car. Car edges can only be traveled by car. The robber knows the position of the other cops. The cops do not know where the robber is but they can smell it if it is nearby. Also, every eight turns the robber leaves an evidence that dissapears after 24 moves. If a cop finds the evidence it (I use it for robots -- it is also gender neutral :) ) knows when the robber was on that node. The cops can communicate each other what they know (they can lie) and can collaborate to apply a common plan (if they wish to do so). For details see the full description on the official site.

The bots had to be implemented as programs that communicated through stdin and stdout using a line-based protocol given in the spec with a server. The server was written by the organizers and you could also use it to visualize what is going on.

How did I worked?

For 20 minutes I coded (or starred, for that matter) at the computer and then I took 10 minutes break. If midday the go take a bicycle park tour else if nine o'clock in the evening go get some sleep else go to the beginning of the paragraph. Why so many breaks? Mainly because of my eyes.

What have I done?

A cop and a robber, that is I managed to submit something. On Friday evening, about 3h after the task was announced I had a random-walking robber, which I have implemented just to make sure I understood the protocol description correctly. On Saturday I coded a robber that actually wanted to rob banks and run from cops. On Sunday I spent half a day looking at walls and thinking how to make a smart robber. At midday I realised I ain't gonna get any help with the cop so I started to write one. In the evening it wasn't working because of a few bugs that I fixed Monday morning. Then I submitted and went to work, where I felt helpless trying to catch a bug in some code that was just too big for me to understand (and extremely badly indented; in fact I think it wasn't indented at all: just had some random number of spaces at the beginning of each line). I didn't tested the cop too much so I may get disqualified for not following the protocol.

Let me tell you about the robber. It has a bank target it wants to reach. If it can't get closer to its target without going too close to a cop then it changes it target. If it can't get closer to any bank then it panics and just runs as far as possible from the closest cop. This was caught by the cops produced by radeye's team in 27 moves :( There are too small problems with my robot and a big one. First it was to bold. I've intentionally let the robber get into smelling distance of a cop to give him more movement freedom. Second, if it is surrounded, then the farther it can get from cops at any one point in time is to just sit like a duck. But the big problem is that it does not look into the future at all. I had intended to code a future-looking robber that uses a heuristic borrowed from physics to asses individual positions but didn't code it (except the heuristic -- but it is not used). The main difficulty (for me) was that alpha-beta prunning is not applicable to this game since players do not have perfect knowledge and hence cops will clearly make sub-optimal moves most of the time. Alpha-beta is not good against a sub-optimal player. I couldn't see a natural way to modify the algorithm to fit this situation. Of course, I could have used many heuristics but none seem to make sense, except on an intuitive level.

Now that you are acquinted with my robber artificial stupidy you are (I hope) prepared to meet the even greater artificial stupidity of my cop.

It's main feature is that it does not collaborate well... at all. (should I put a comma before "well" :) ? ). It's not because I'm an individulistic person: I'm not. It's just that I didn't had time to code it. (how I managed to get time to look at walls for at least 3h in this situation I don't know). Then what does it do? It uses the informations it finds and what others share to compute probabilities of the robber being in each node. Then chooses six most probable locations and computes what is the best way for cops to get there (again, doesn't really look into the future since this is assuming the robber stands still :p). Then shares its plan and follows it regardless who's plan wins the vote of the cops. Ah, I forgot, it always votes for himself.

That is what I did. But why did I performed so poorly?

I guess the main reason is that I did not had a plan. Not only my robots are short sighted but I am too. (Actually the name of my bots is "blind". At the time I baptized them I was thinking about my extremely fatigued eyes, but now I guess it best describes their behaviour of not anticipating). It is OK to start coding to get a better feel of the problem if you think you did not quite understood/remembered it. But it is not OK to keep switching every couple of hours from thinking about architecture/design to coding to thinking about cop heuristics to thinking about parsing to thinking about robber heuristics, etc. Well, this switching is not that bad in itself, but it did contribute to the aimless feeling that I got on Sunday that almost convinced me to give up.

So, have an overall plan, focus on one thing at a time in a top-down approach. Take breaks and experiment with smaller bricks before you decide on interfaces when you aren't quite sure what you can use or how you can implement something. In short: practice and have a plan.

One other thing: for longer time tasks I seem to have the tendency to pace myself to finish just before the deadline (or just before going to work in this case). Why do I do that? It may be true that if I don't hurry I do a better job. But... there are two buts: (1) for this task instead of being more careful and thinking more focused I spent time "resting" my brain, and (2) for some tasks, including this one, where you can't find a perfect solution the best approach is to try many approaches -- not just a "good" one.

BTW, check this out!

Theory and Practice

28 June 2005

ICFPC 2005 - my story

No comments:

Post a Comment