Tuesday, May 05, 2009

A benchmark where clang beats gcc

I sent my last post about clang to a friend, suggesting that the final days of gcc could be coming, and he countered with a very pragmatic point: "talk is cheap, show me the (generated) code". :-P So I decided to get serious.

I already said in that post that clang is still in development, so how could I show him a working example? I know clang is able to compile large projects, like gcc and FreeBSD. After playing with some small projects, I found this one, from the Computer Language Benchmarks Game. The benchmark is called fasta, and it generates pseudo-random DNA sequences. I chose it because it is small and, even though the program uses floating point, the outputs can be compared directly without rounding errors (because they're composed of DNA elements).

My system: an HP Pavilion dv9000, Core 2 Duo T5500 1.66 GHz, 2 GB RAM, running Ubuntu 9.04.

The compilers: the old guy is gcc 4.3.3 from the standard Ubuntu package. The new guy is clang, downloaded today from svn (llvm rev 71035/clang rev 71041). llvm and clang were compiled with "--prefix=/opt/clang --enable-optimized".

The execution: I just ran each binary under GNU time repeatedly until the execution time converged, picked one result at random and rounded it to one decimal place. Remember that I just have to convince a friend. :-P Standard output was redirected to /dev/null.
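The "run it repeatedly" step can be sketched as a small helper. This is only an illustration, not the exact procedure used for the numbers in this post: the function name time_runs is made up, and it relies on bash's built-in time keyword (with TIMEFORMAT) instead of GNU time, so it needs bash rather than plain sh.

```shell
# Hypothetical helper (bash only): run a command N times and print the
# elapsed wall-clock seconds of each run, one per line.
# Usage: time_runs RUNS COMMAND [ARGS...]
time_runs() {
    local runs=$1
    shift
    # %R makes bash's time keyword print only the elapsed real time.
    local TIMEFORMAT='%R'
    local i=1
    while [ "$i" -le "$runs" ]; do
        # The command's own output is discarded; the timing report goes to
        # the stderr of the brace group, which is redirected to stdout.
        { time "$@" > /dev/null 2>&1; } 2>&1
        i=$((i + 1))
    done
}
```

Something like "time_runs 5 ./b 25000000" would then print five timings, which makes it easy to see whether they have converged.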

At first I did this:
$ gcc -O3 -o b b.c
$ /opt/clang/bin/clang -O3 -o bb b.c
$ time ./b 25000000 > /dev/null
$ time ./bb 25000000 > /dev/null

But this gave an incredible 11.1s for gcc and 7.4s for clang. It was a nice surprise to see clang beat gcc, but something was wrong, since gcc is not a bad compiler. A quick look at the assembly output showed that clang was using SSE registers for moving data around and for calculations, while gcc was not. This is a point in favor of the clang driver's defaults, but it says nothing about how the two compilers compare on equal terms. So, the correct command line for gcc is:

$ gcc -O3 -march=native -mmmx -msse -msse2 -msse3 -mfpmath=sse -o b b.c
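The assembly check mentioned above can be approximated with a rough count of the lines that reference SSE (xmm) registers. This is a sketch of one way to do it, not necessarily how I looked at the output: the helper name count_xmm and the file names b-gcc.s and b-clang.s are made up for the example, and a plain text match on "xmm" is only a crude indicator.

```shell
# Hypothetical helper: count the lines of an assembly listing that
# mention an SSE register (xmm0, xmm1, ...).
count_xmm() {
    grep -c 'xmm' "$1"
}

# Assumed usage with the b.c from this post (-S stops after generating
# assembly instead of producing an executable):
#   gcc -O3 -S -o b-gcc.s b.c
#   /opt/clang/bin/clang -O3 -S -o b-clang.s b.c
#   count_xmm b-gcc.s
#   count_xmm b-clang.s
```

A count of zero for one compiler and a large count for the other is a good hint that the two are not targeting the same instruction set.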

With that, we can finally settle on the results: 7.7s for gcc and 7.4s for clang. Clang wins!
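To put the gap in perspective, the relative difference can be worked out from the rounded timings. A minimal sketch, assuming awk is available; the function name pct_less is made up for the example.

```shell
# Hypothetical helper: how much less time (in percent) the second
# measurement takes compared to the first.
# Usage: pct_less SLOWER_SECONDS FASTER_SECONDS
pct_less() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

pct_less 7.7 7.4    # gcc with SSE flags vs clang: prints 3.9
pct_less 11.1 7.4   # gcc with plain -O3 vs clang: prints 33.3
```

Keep in mind that the inputs were rounded to one decimal place, so the smaller figure is only a rough indication.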

1 comment:

Isaac Gouy said...

If you notice other programs which you know would perform better with different compiler options then let us know!

Here are the updated fasta measurements.