This fall, we (Team PLClub) participated in the Third Annual ICFP Programming Contest and were lucky enough to win the 1st prize. So we here made up a web page to brag;-) about how we did it.
In the second evening of ICFP 2000, when Greg Morrisett announced the 1st prize winner, I could almost hear people wondering: "hey, I know Camls 'R Us and Galois Connections, but who in the world are PLClub?" Right, most of us are not as famous as Xavier Leroy (though Jerome may be well-known as a developer of "O" in OCaml) and we kept silent on purpose: to make it more fun.
So who are we? In short, we are de facto students of Benjamin Pierce in the University of Pennsylvania, though we all have somewhat complex status.
Haruo, who participated in the contest last year, invited Eijiro to participate in the contest this year. He registered Team PLClub. Jerome and Vladimir joined soon. We agreed to use OCaml a priori with no discussion at all.
The contest began. The task was a kind of "programmable" ray tracing.
However, Haruo was down because he caught a cold, and Eijiro was away because he participated in another conference and his flight from State College to Philadelphia was canceled! Fortunately, both Haruo and Eijiro recovered and arrived in the (late) evening.
We were glad when we heard the task, because Eijiro and Haruo had some experience of ray tracing in another contest in U-Tokyo and we knew that it's not so difficult as it seems. Eijiro and Haruo looked into the ray tracer that Eijiro wrote several years ago in C++ (which was a pain, of course), though it helped little.
Meanwhile, Jerome finished the parser and (the initial version of) the interpreter.
Then we slept. Yes, throughout the contest, we did sleep enough as usual in our homes.
We all began mathematics for the ray tracer. The main issue was how to compute the intersections of a line and various kinds of surfaces. We struggled to recall high-school mathematics such as matrix representation of quadratic surfaces.
By the end of the day, we implemented everything from Tier-1 to Tier-3. There was nothing special in this implementation; we just followed the description of the task. The only technical point was that we represented vectors and matrices as arrays rather than tuples or records for the sake of convenience (to use for-loops in matrix multiplication, for example), though they could be a little less efficient and less safe.
However, our images looked rather different in color from the judges' samples. We wondered why and Jerome computed the colors of some points (in the ball of "golf") by hand. The results were quite similar to ours and much brighter than the judges'! He sent an e-mail to the judges, but it took more than 1 day till they fixed the bug.:-)
Now that we implemented everything, we began testing, debugging, and optimizing our code.
Vladimir created the image below for the purpose of testing.
Jerome implemented detection of constant surface functions: when a surface function is applied for the first time, it is applied to arguments of wrong types; if it doesn't raise an error, it must be a constant function (because of parametricity -- no side effects were possible in surface functions), so it is substituted with the constant. This optimization led to a significant speedup.
Eijiro implemented de Bruijn indices, which removed the overhead of variable name lookups in the interpreter. He also implemented inlining of (the values of) the free variables in functions. At first, the inlining was performed for every function, which was too slow. Thus, it was performed "lazily" at runtime for surface functions only. These two optimizations led to another significant speedup, in particular for non-constant surface functions.
Vladimir and Haruo tested this implementation and found a bug in cones. (We expected this because cones were the most difficult objects to implement.) Jerome fixed it immediately.
Jerome implemented a kind of "cheating" that Eijiro suggested: compute the pixels in odd coordinates and interpolate the other pixels if the neighboring pixels are similar enough. This worked really well---it led to a speedup by the factor of 3 on average with minimal change of the output---but it was also a rather controversial "optimization". After some hot discussions, we decided to submit BOTH the one without interpolation and the one with interpolation, and let the judges choose. After all, the judges chose the one WITHOUT interpolation, which was already fast enough to win the 1st prize.
Haruo created two images: a glass and a rotating spoon. The rotating spoon actually consists of several images for animation, as you can see below (in major browsers). To stop the animation, click the "stop" button of your browser.
Eijiro derived a compiler from the interpreter, using some techniques of partial evaluation. The compiler translates GML code into Caml code, compiles it, links it with the ray tracer, and executes it. Unfortunately, this compiler was not very useful in practice, because ocamlopt(.opt) was too slow and often crashed for the complex programs that the compiler produced.
Jerome implemented bounding spheres, which helped very much to omit intersection computation. This led to yet another significant speedup.
Finally, we wrote the README, made the statically linked executable, put the tar.gz on our web server, and submitted our entry.
We found two small bugs, one in the "buildme" script (it started as "#!bin/sh" with "/" missing, because we copy-and-pasted the sample, which had the same bug) and the other in the GML parser (it failed to parse constant floats with negative exponentials). Fortunately, both of them were so trivial that the judges didn't take them as problems.
We forget to implement reflection cutting, that is, stop recursive tracing when the light becomes weak enough. (We noticed that on seeing the entry by Camls 'R Us.) As a result, our tracer is somewhat slow when the scene has many mirrors.
There were interesting chats about this contest in the Caml mailing list and the Haskell mailing list. We began to expect something.
The notification came when we were wondering what has become of our entry. We were anyway going to attend ICFP 2000, but we now saved our fund.
The judges published the results. We were much faster than others -- actually, even faster than we thought! In some cases, the difference was much larger in the judges' tests than in our experiments. We don't know why... We are now trying to see the reason. (Could it be because the judges used not very expensive machines?) We will report it here if we find it.