21 May 2008

A bit of history

This my statement of research interests that I wrote in 2005 to get a PhD position in Dublin, with some minor subsequent editing. Those of you thinking about writing such a thing might find it useful.

Statement of Research Interests

The taste for formal methods (in a broad sense) was instilled in me by working on translators for languages equipped with formal semantics and by reading Dijkstra's notes. It was refined by learning functional languages from places like Didier Remy's lectures on OCaml.

But my background/history should be relevant too. I will start in chronological order with facts that have little to do with this particular position but that, I think, help you form a better idea about me.

My first computer was built from parts by my father, Corneliu Grigore, when I was 6 years old. It was based on Intel 8080 and came with a BASIC interpreter. It was a very pleasant experience for me to see that a machine made of metal and plastic can understand a rudimentary language and obey commands like DRAW 10, 10, 30. That is when I first got acquainted with loops.

When I was about 12 years old my father brought me two books: "Pascal" and "C++". He said that these are much better programming languages than BASIC and that I should probably start with Pascal because it is easier. So I read the book. Pascal still is the only language about which I have read a whole book but never wrote a line of code! I did not have a computer with a Pascal compiler. Then I read the C++ book and, when I was close to the end, I finally got a computer with Turbo C++.

If you are already getting bored be assured: there is only one more story from the (very) early days.

This story shows how I fell in love with simplicity. One day my father (yes, him again) showed me a problem he had at work. It was a real problem but in simple terms he had an adjacency matrix in an Excel spreadsheet and wanted all-pairs shortest paths so he could optimize a telecom network. In about 2 days I learned enough Visual Basic (for Applications) to write a solution to this problem. It was a backtracking algorithm (please don't say "yuck" -- I haven't read any book on algorithms at that point). It worked. A few months later there was a problem: my father wanted to do the same thing for 40 nodes instead of 20 but the program did not work anymore. First he thought that maybe there is some hidden 20 constant that he forgot to change so he asked me to take a look. I discovered that the program actually worked, only extremely slowly. I have spent a week trying to find a fast enough solution (I was about 15 years old). I have failed. In the process I even acquired a strong belief that it cannot be done. But my father knew better than to give up. A week later he gave me book by Tannenbaum and said: "Look at this section. It describes something called Dijkstra's algorithm and I think it might help." I took the book, with distrust, and read that section. I was flabbergasted! How could I have missed something so simple and beautiful and obviously correct for one whole week? After a few hours of coding the program worked flawlessly.

It is a story that I remember vividly because it taught me many lessons and influenced me a lot more than many formal lectures I have attended. I have learned that you should do your research when you try to solve a problem. As Newton put it: "If I have seen further it is by standing on ye shoulders of Giants." I have learned not to give up so quickly when trying to solve a problem. More often than not, what you need is a different mindset. Finally I have learned that the good solution is often much simpler than you suspect. So when you try to solve a problem, try to do simple things.

That's quite enough background on my early days. What have I done in college? I have a BSc degree in electrical engineering, my specialization being Telecommunication Networks.

I have used the computer as a tool for better learning. For example, to understand what was thought in the analog ICs lecture I experimented with various circuits by simulating them with SPICE. This kind of experimentation was useful for other lectures as well: Signal Processing, Computer Architectures, etc. I have used a wide range of tools: MathCAD, MATLAB, Mathematica, pure C, C++.

During a one month scholarship in Darmstadt I have developed, with a colleague, my first real program. The requirements were: "Write a program that we can use in a lab to teach the graph algorithms in this book. The students should be able to visualize the algorithms. You don't need to implement all algorithms, but design the program such that new ones are easy to add." We thought about two approaches. One was to define an interface (abstract base class actually — it was C++) and use the Adapter pattern. Why use an adapter? Well for each algorithm we usually had two components: one that talked in terms of "red nodes", "edge with tag 2", etc. and another that talked in terms of the problem, e.g. "source node", "destination node". That made it easier to reuse implementations of various algorithms — because those implementations knew nothing about GUI concepts. This is what we have actually implemented. It took two weeks. The other approach was seducing but, since this was our first real project, we were afraid that one month won't be enough for us to finish it. The idea was to design a language in which it is easy to describe various graph algorithms, while including information about how they should be represented. And then we could have structured the GUI as an interpreter.

Another adventure during college was a set of two completely unrelated papers I have presented at some "student conferences". One was about using mathematical methods from transmission lines theory to the study of quantum tunnelling and the other one was an extension of the network optimization problem about which I have already talked.

The first incident that made me like programming (the one with Dijkstra's algorithm) was also a lesson in humility. The second incident that made me like programming was a boost to my self-confidence.

Shortly after I have presented the network optimization paper, with practically no time to prepare, I have participated in the local phase of the ACM ICPC. I found out about its existence a week before and I was curious to see what kind of problems are given. Traditionally in my university, only CS (and no EE) student participated, so no colleague of mine could give me any details. I was very happy that after that phase and a few more tests I was selected to represent the university in the regional phase. After that I started to participate regularly in TopCoder and it is an entertaining experience. I have also started to populate my library with more CS-related books. As a result I have acquired The Art of Computer Programming, Programming Pearls, The Practice of Programming, Introduction to Algorithms, A Course in Combinatorics and other similar books that are a lot of fun to read.

My diploma thesis is written in Romanian. It begins by presenting theoretical concepts behind a fractal-like model of network traffic. Then it introduces a parameter (Hurst) that intuitively quantifies the burstiness of the traffic. I then present some standard estimation methods for this parameter. My idea was to implement a program that uses various methods to estimate this parameter in real time by watching the computer's network card. It turned out that the volume of computation required by the standard methods was to high to do anything in real-time. That's probably one of the reasons why at that time the real-time estimation was done only with specialized hardware (the other reason being that standard network cards aren't great when it comes to time precision). So what I ended up doing was a modification of one of the standard methods that allowed the real time estimation at the cost of some accuracy.

Finally, another good experience from college was being a teaching assistant for two courses: Data Communications and Networking Software. I was in charge with the laboratory sessions. For the Data Communication course I presented low level things about computer networks (e.g., data level protocols). The laboratory for the Networking Software course was actually an introduction to Java. The students were already in the 5th year of study but the lab is introductory because there isn't much programming going on in the EE department.

Since 2003 I work for Nobug Consulting. I was involved in three major projects: s2x, vfe, and e2vera — all of them translators.

On the s2x translator the team had two members (Valentin Gheorghita and me). The tool we wrote (4 months) constructs Verilog and AXE checkers from PSL properties. The most interesting aspect of this project was the fact that it was driven by theory. We started the construction of our algorithms from the formal semantics in the standard. The conciseness and clarity of those rules made me want to find out more about programming languages and the use of formal methods. The result was a high-quality program written much faster than management expected. Another result is a case-study paper.

The Verilog Front-End (vfe) was developed by me for two products of IBM Haifa Research Lab: Rulebase and FoCs. It was an interesting project from a software engineering point of view. I was very much pleased when the author of the requirements document wrote me at the end of the contract: "I am sure that IBM will use it for many years, and it is surely the best piece of software in our group from the point of view of software engineering." Apart from applying common sense, I have used design patterns extensively, I have commented the code thoroughly, I have used tools to compute software metrics like cyclomatic complexity and comments density to pinpoint places that need my attention. And I followed a process very close to what I describe on my blog, except the review part (since I was working alone).

The project on which I work now, e2vera, is a translator between to major languages used in hardware verification. Both of them are roughly the size of Java. The team had 8 members until recently (when two left). The code has a good quality, but lower than what I've seen on previous projects, probably because some of the members are novice coders. The fun part on s2x was the algorithmic part, the fun part on vfe was software engineering; here I have tried to enjoy functional languages by developing many helper tools in OCaml. For example I developed a small prototype to check if my algorithm for detecting translatable temporal expressions really works. To feed this prototype I have wrote a small tool that crawls a directory for `e' code and extracts (and anonymize) all temporal expressions it finds. Another small OCaml script computes cyclomatic complexity of functions. And so on. Not all auxiliary tools were developed in OCaml. I have also wrote a code generator (inspired by something similar that I've done on a smaller scale for VFE) that writes C++ code for the AST data structures starting from a succinct representation of the abstract grammar and a few templates. This mechanism is also used to automatically generate part of the visiting infrastructure. In fact I wrote this tool because when I joined the project the team had a parser and a big document describing about 200 AST classes that they were planning to write by hand! That document was written in good style, though, and we used it as input for my code generator.

In the last few paragraphs I'd like to say a bit more about what I do in my free time.

Recently I decided that it would be good for me to get involved into open source, so I started a project, for which I did not have much time lately (we released a beta version of e2vera recently). But it is an interesting project. First of all, if you write open source you know that others will see your code. This makes you try to be on your best behavior and this increases software's quality: it is more readable, more maintainable, etc. Second it is a good opportunity to practice using functional programming seriously. Functional programming is very important for the different, simpler mindset that it forces onto you. Reading code like the one in Chris Okasaki's Pure Functional Data Structures is sure to give you a new perspective on programming (at least if you came from the imperative realm). And third, it is a good way to practice coding more complex algorithms and data structure than any TopCoder could ever ask for (because of time constraints). Since it is a search application, performance is especially important. Advanced data structures are a must.

Other things that I enjoy: reading from TAOCP in the morning, solving puzzles from IBM, communicating over the Internet with people I don't know. In fact I think that browsing through mail archives can give you a good idea of various things that I've done in my free time at various times.

What I expect from a PhD program is a challenging and supportive environment. I want people to feed me with problems ("hey, I'm trying to do this, I've found this related literature; do you know some other results that might help me or maybe you have an idea about how to proceed?"). And I want people to which I can go with the same kind of questions when I am stuck.

I also like to program. Programming is a creative and very much enjoyable activity. So I want my PhD to involve some programming.

With respect,

No comments:

Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.