Benchmarks

February 02, 2007

In the article lighty 1.5.0 and linux-aio we proposed a benchmark suite for measuring the performance of a disk-io-bound application.

http_load

As the load generator we use http_load because it

  • allows random fetches from a list of URLs
  • allows a large number of parallel requests
  • is portable

On the command line we run it like this:

$ ./http_load -verbose -parallel 100 -fetches 10000 urls
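
Since the same invocation is repeated for each URL list, it can be wrapped in a small helper. This is only a sketch: `run_bench` and the `HTTP_LOAD`, `PARALLEL` and `FETCHES` variables are hypothetical names introduced here, not part of http_load. The flags themselves are the ones shown above.

```shell
#!/bin/sh
# Sketch: run the http_load invocation from the text against one URL list.
# HTTP_LOAD, PARALLEL and FETCHES are hypothetical knobs for this wrapper.
HTTP_LOAD=${HTTP_LOAD:-./http_load}
PARALLEL=${PARALLEL:-100}
FETCHES=${FETCHES:-10000}

run_bench() {
  urls=$1
  # Same flags as in the article; only the URL file varies.
  "$HTTP_LOAD" -verbose -parallel "$PARALLEL" -fetches "$FETCHES" "$urls"
}

# Example:
#   run_bench urls
```

Keeping the parallelism and the number of fetches in variables makes it easy to repeat a run with identical settings.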

file pool

On the machine to be tested we generate two sets of files, each 10 GByte in total. One consists of 100,000 files of 100 kByte each, the other of 1,000 files of 10 MByte each.

$ cd $docroot
$ mkdir -p seek-bound/100k/
$ cd seek-bound/100k/
$ for i in `seq 1 1000`; do 
  mkdir -p files-$i; 
  for j in `seq 1 100`; do 
    dd if=/dev/zero of=files-$i/$j bs=100k count=1 2> /dev/null; 
  done; 
done
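
The benchmark is meant to be disk-bound, so the pool has to be clearly larger than the memory available for caching. A rough way to check this on the Linux server is to compare the pool size with `MemTotal` from `/proc/meminfo`. The helper names `pool_kb`, `ram_kb` and `ratio` are hypothetical and only serve this sketch.

```shell
#!/bin/sh
# Sketch (Linux only): how many times larger than RAM is the file pool?
# pool_kb, ram_kb and ratio are hypothetical helper names.

# Size of a directory tree in KiB.
pool_kb() { du -sk "$1" | awk '{ print $1 }'; }

# RAM in KiB as reported by the kernel.
ram_kb() { awk '/^MemTotal:/ { print $2 }' /proc/meminfo; }

# Quotient of two numbers, printed with one decimal place.
ratio() { LC_ALL=C awk -v p="$1" -v r="$2" 'BEGIN { printf "%.1f\n", p / r }'; }

# Example (run from $docroot):
#   ratio "$(pool_kb seek-bound/100k)" "$(ram_kb)"
```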

The file-pool is 10 times larger than the available RAM on the server-host. Based on this disk layout we generate the list of URLs for http_load.

$ cd $docroot
$ find ./seek-bound/100k/ | grep 'files.*/.' | sed 's#\./#http://192.168.2.106/#' > http-load.100k.urls

The same commands are executed for the 10MByte files to generate a file set that checks the performance for large files.
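
A possible version of those commands for the large files is sketched below. The directory name `seek-bound/10m`, the split into 10 directories of 100 files, the output name `http-load.10m.urls` and the helpers `make_pool` and `make_urls` are assumptions. The article only states that the set holds 1,000 files of 10 MByte.

```shell
#!/bin/sh
# Sketch: create NDIRS directories of NFILES zero-filled files each.
# make_pool and make_urls are hypothetical helpers, not from the article.
make_pool() {
  pool=$1 ndirs=$2 nfiles=$3 bs=$4 count=$5
  for i in $(seq 1 "$ndirs"); do
    mkdir -p "$pool/files-$i"
    for j in $(seq 1 "$nfiles"); do
      # Each file is bs * count bytes of zeros.
      dd if=/dev/zero of="$pool/files-$i/$j" bs="$bs" count="$count" 2> /dev/null
    done
  done
}

# Turn every regular file below ./DIR into a URL on the test server.
make_urls() {
  find "./$1" -type f | sed 's#^\./#http://192.168.2.106/#'
}

# Example (run from $docroot; writes about 10 GByte):
#   make_pool seek-bound/10m 10 100 1024k 10
#   make_urls seek-bound/10m > http-load.10m.urls
```

Using `-type f` in `find` replaces the `grep` filter from the 100k case, since only regular files should end up in the URL list.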

hardware

The test-network is made up of:

  • Netgear GS108, an 8-port Gigabit Switch

  • client:

    • OS: WinXP Prof. 64-bit
    • CPU: AMD64 X2 (dual core) 4200+
    • Network: Intel Pro/1000
  • server:

    • OS: Linux 2.6.16.21-0.25-default x86_64 (OpenSuse 10.1)
    • CPU: AMD64 3000+
    • Network: Intel Pro/1000
    • Modules: stock, but ip_conntrack is rmmod'ed
    • Disks: 2 SATA disks as RAID1 via the md-driver

The disks are:

Model Number:       ST3160827AS
Serial Number:      5MT02VGJ and 3MT08WDV
Firmware Revision:  3.42
$ cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sda2[0] sdb2[1]
           155235968 blocks [2/2] [UU]

lighttpd

February 02, 2007

Security, speed, compliance, and flexibility: all of these describe LightTPD, which is rapidly redefining the efficiency of a webserver, as it is designed and optimized for high-performance environments. With a small memory footprint compared to other webservers, effective management of the CPU load, and an advanced feature set (FastCGI, CGI, Auth, Output-Compression, URL-Rewriting and many more), LightTPD is the perfect solution for every server that is suffering load problems. And best of all, it’s Open Source, licensed under the revised BSD license.

light footprint + httpd = LightTPD (pronounced lighty)

Web 2.0

lighttpd powers several popular Web 2.0 sites like YouTube, Wikipedia and meebo. Its high-speed I/O infrastructure allows them to scale several times better with the same hardware than with alternative webservers.

The development team builds this fast webserver with the needs of the future web in mind:

Its event-driven architecture is optimized for a large number of parallel connections (keep-alive), which is important for high-performance AJAX applications.

The Server

January 30, 2007

Back in February 2003 I (Jan) was finishing my thesis. Writing a thesis is, as far as I can say, the most frustrating job you can do.

While my subject was great (‘Development of a handheld system to monitor and control CAN-bus systems in a non-automotive environment’) and the hardware and software development was a real challenge, I still had to write the thesis, which documented the design decisions, the preparations and so on.

For a developer this is the really hard part: writing the documentation, especially as the date when you have to finish the thesis gets nearer every day. To motivate yourself you need a distraction.

For me it was a proof-of-concept for the c10k problem described by Dan Kegel: how to handle 10000 connections in parallel on one server. I had already seen Apache installations killing systems because they ran out of memory and into swap with only 100 parallel connections.

In the first weeks the only challenge was how to write something fast and optimized. In the ChangeLog you can still see my comments on how to achieve this. Cache as much as you can. Why regenerate the timestamp for the ‘Date:’ header 1000 times per second if it is the same all the time?

As I needed PHP support for my own purposes, it was one of the first features added. The ChangeLog says I had it working two weeks after the rest was done, including the load-balancing to distribute the load from one webserver to multiple FastCGI backends.

At one point I asked myself: This is just a proof-of-concept. Where are we now? How do we compare to the other servers?

The first opponent was thttpd, the big single-threaded webserver. Especially on large files we outperformed it by a factor of 2 to 5. Next were boa and mathopd, both with problems and slower. Zeus was the first real challenge, and the tests proved that Zeus is a great webserver. If you want to spend money on a webserver, the way to do it (next to asking me to develop something for you) is buying licenses from Zeus. (No, I’m not paid by them in any way.)

Different optimisations were added: new event-handlers like epoll and kqueue, new network backends like sendfile().

So, I now look back on 2.5 years of development and see the number of installations rising every month. http://news.netcraft.com/ is telling us every month that the numbers are still increasing, and I send a mail to the mailing list so everyone knows and shares the joy that this little proof-of-concept grew into a well-working webserver.