24 February 2009

Advice to beginner programmers

This is a slightly edited version of an email I sent to the students in my UNIX programming class. It's a class where they have to write many small C programs as homework to get points.

Dear students,

It is a good time, now that a quarter of the term has passed, to reflect upon your performance in this course (and others) and what you can do to improve. My advice is to focus on two things: attitude and work style.

Attitude

You should program for fun. If you're not having fun then you are doing something wrong. I'm not talking about the type of fun you have when you laugh at a friend's jokes. Don't expect your program to tickle either. It's a different type of fun, one that lasts longer and is less spiky. Like other types of fun it ends with pleasure, the pleasure of seeing that your program solves a problem that seemed difficult. Perhaps. Or perhaps it seemed easy at first, but then the picky WebEval made you realize that there was more to do than you initially thought.

The point is: Program for the pleasure of finding things out and for the pleasure of seeing your program doing exactly what it is supposed to. The points will come.

Work style

Don't write random programs in the hope they will work. Submitting code that doesn't compile won't get you any points. Similarly, submitting the reference solution for problem A as a solution for problem B won't get you any points. These kinds of practices are hopeless. Better go have a pint if you plan to keep at them and save yourself some time. Or, better yet, try to listen to the advice I gave you, which I repeat below.

Work environment. Start by setting up a work environment. It should at the very least allow you to easily compile, run, and test your program. If these three tasks are hard then you do not have a proper working environment.

Understand the problem. The most important step is to understand what it is that your program should do. Read the problem statement. Reread the problem statement. Reread the problem statement. And once more. Write (on paper) a few other possible inputs and check that you know without doubt what the output should be. If you don't understand the problem then ask. Don't say "I don't understand the problem" because the only answer you'll get is "reread the statement". Instead say "can x be negative?", "is this the correct output for this input?", "do we need to have some background knowledge on this Caesar guy to understand the problem?", "is there a time limit?", "when you say that other characters should be ignored, do you mean that they can be interspersed anywhere between the characters that shouldn't be ignored?", and so on.

Find a solution. Once you know what you have to do (and only then) proceed to devise a solution. The easiest way to do so is to solve a few cases by hand and observe what you do. The 'solution' is a description of what you see yourself doing. The description has to be clear and unambiguous. Otherwise it's not an algorithm. Write it on paper as a recipe. If you can't then try harder. Don't go to the next step anyway, because you'll just waste time. Look at other examples. Try to keep track on paper of everything that is going on in your head. Ask yourself what the smaller steps are. How did you know the answer is x? Never settle for "I just know".

Don't bite off too much. You know what the problem is and you have your recipe written down? Then it's time to start programming. Your program must compile at all times. If the compiler reports an error, never fix it by randomly changing code in the vicinity of the error. Fixing the code without understanding what was wrong in the first place is worse than not fixing it at all: You missed an opportunity to learn. If you have a big pile of code with more than 10 errors then it's like trying to learn how to swim by asking someone to drop you off in the middle of the Atlantic. You won't learn to swim. Most likely you'll just die. So don't do it. Write your program incrementally. Keep it compiling. Run it from time to time and check that it does what you expect. This means that you should think about what you expect before running the program. Any mismatch between what you expected to happen and what happened is an opportunity to learn. Don't miss it. Try to understand what is happening.

Whenever you have an issue you can't explain, it helps to try to reproduce it on the simplest example. If the issue is a compilation error, then make a copy of your program and systematically trim down the code in a way that preserves the error. Do this until you get to the smallest possible program that still exhibits the error. That is, there's nothing more you can trim without making the error go away. By the time you get to this step you probably know what was wrong: you can fix the problem in the small example, test, and then repeat the fix in your big program. If the example is tiny and you still don't know what's wrong then ask in the forum (or a demonstrator if you are in the lab).

Tests. You should test your program as it grows by running it. Once it approaches what you think is close to a final solution your testing must become more thorough. Do not just run the program, type in the input, and look at the output on the screen. Write all your tests (inputs and outputs) in files so that you can retest easily after you change the program.

Cover all special cases you can think of with your tests. The "Encode Caesar" problem had shifts -1, 0, 1, 2 (always try these if they are valid input), 26 (because that's the size of the alphabet), 2^31 - 1 and -2^31 (the biggest and the smallest possible input). For each of them you should write the output by hand and see whether the program's output matches. The "Reverse" problem had empty lines, lines with one character, lines with two characters, lines with 500 characters, lines that contain only one character repeated many times, lines that contain palindromes, lines that contain only distinct characters.

Cover extreme cases. Special and small cases are there to check for correctness. You must also check that your program is not horribly slow. Let's put it this way: Will you finish the job of solving one of the biggest test-cases if you do it by hand before you die? If not, then your program is probably horribly slow. Since a computer can do simple operations about 10^9 times faster than you can and you will live around 100 years (hopefully more) any program that takes more than 3 seconds is horribly slow. To cover extreme cases you may want to download a lot of data (like Gutenberg books for the Caesar problems) or you might want to write a different program, a test generator (for example, for Postfix Evaluator).
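A rough check of the 3-second figure, assuming a 365-day year and a computer that is about a billion times faster than you are:

```latex
\[
100\ \text{years} \times 365\ \tfrac{\text{days}}{\text{year}}
  \times 86\,400\ \tfrac{\text{s}}{\text{day}}
  \approx 3.15 \times 10^{9}\ \text{s},
\qquad
\frac{3.15 \times 10^{9}\ \text{s}}{10^{9}} \approx 3\ \text{s}.
\]
```

So 3 seconds of computer time is roughly one human lifetime of work by hand.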

Bug finding and killing. What to do if the program does not do what you expect? First, do the easy thing: Run the compiler with all warnings turned on (-W -Wall -pedantic) and check whether any of them points to the problem. In any case, get rid of them. Then run your program with valgrind to make sure it's not the fault of nasal demons. If it is then try to isolate the problem in the same way as you isolate a compiler error: Trim down a copy of your program. Then time your program (time ./program < test.in > test.out) on a big test case. Try to limit its memory (see limit.c in the forum). It might be too slow or too obese.

If all this fails it's time for serious debugging. There are two basic debugging strategies: (1) read the program carefully and (2) trace your program for particular runs. The latter is easier but as you become more experienced you'll see that the former is much faster for simple bugs. Anyway, how do you trace a particular run of your program? Read the program having in mind a particular input. At each step think what the values of the variables are. If you feel lazy (and that's often a good thing when you are a programmer; we are after all in the business of automating stuff), then put printf statements that print the values of variables at intermediate points and check when their values diverge from what you expect. Another option is to use a tool like gdb that lets you go step by step thru the program.

Labs and textbook. Are the homeworks too hard? Then don't do them. Or rather, do them after you do the labs and read the corresponding piece from the textbook. K&R is likely the greatest book ever to be published about a programming language. When it first appeared it was a complete revolution compared to the other language books. And a few decades later it still ranks among the best even though almost everybody tries to imitate the style nowadays. So, you have a great textbook: Use it. Read thru Chapter 1 and Chapter 2 and solve the exercises as you find them. They are the same exercises you should be doing in the labs. How to tackle an exercise? For all those that ask you to write code, follow the approach outlined above.

Ask in the forum. I have yet to see a programming question in the forum. This would suggest that you have no problem tackling homeworks. Or you are extremely shy. Don't be. Or there's some other secret reason I can't imagine. If you tried to set up your work environment and you hit an issue, ask. If you tried to fix a compile error for 10 minutes, ask and attach the trimmed down version of the code that exhibits the error. If you have a bug reported by valgrind and you can't figure out what's wrong even after you trimmed down your code, then ask.

You ask "where do I get the input from?" and once you receive the answer "from standard input" you go "huh?" Then ask: "and how do I read from standard input?" I would have told you at least to look up fgets, scanf, getchar. Then you might have asked "ok, now how do I send data to my program when I run it? do I have to type it at the keyboard every time?" And so on. There are so many questions you could have asked but didn't. Unless you ask, I can't read your mind and figure out what stumps you. Again, be specific, as in the examples in this paragraph.

Now read this again.

PS: If you notice a colleague who is looping forever due to the statement above then break them.
