Powered by Lighttpd

April 04, 2007

lighttpd is used by many well-known sites. The typical scenario is using lighttpd as an off-load server to push out static content, leaving the complex work to another server.
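A minimal off-load setup can look like this — a sketch, not a verbatim config from any of the sites below; the host name and document root are made-up examples, the directives are standard lighttpd 1.4 configuration:

```
# minimal lighttpd.conf for a static off-load server
# (host name and paths are made-up examples)
server.port          = 80
server.document-root = "/var/www/static"

# let downstream caches hold on to the static content
server.modules       = ( "mod_expire" )
expire.url           = ( "" => "access plus 1 hours" )
```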

One example is YouTube. They have a farm of servers which push out the thumbnails you see before you see the movies:

$ curl -I http://sjl-static16.sjl.youtube.com/vi/TgF_eRkfqEY/2.jpg
HTTP/1.1 200 OK
Content-Type: image/jpeg
ETag: "983726135810477085"
Accept-Ranges: bytes
Last-Modified: Fri, 03 Feb 2006 04:32:53 GMT
Content-Length: 3495
Date: Wed, 04 Apr 2007 06:49:51 GMT
Server: lighttpd-aio/

As you can see from the naming scheme, there are more of these servers pushing out content.

Wikipedia runs at least two servers with lighttpd:

  • upload.wikimedia.org
  • download.wikimedia.org

While download is used to distribute the SQL dumps of the database, which tend to grow beyond the magic 4 GByte border, upload is used to push out all the images and resize them when necessary.

$ curl -I http://upload.wikimedia.org/wikipedia/commons/thumb/2/21/Mandel_zoom_00_mandelbrot_set.jpg/250px-Mandel_zoom_00_mandelbrot_set.jpg
HTTP/1.0 200 OK
Content-Type: image/jpeg
ETag: "6460328581220324712"
Accept-Ranges: bytes
Last-Modified: Mon, 04 Dec 2006 22:24:53 GMT
Content-Length: 5973
Date: Wed, 14 Mar 2007 11:16:19 GMT
Server: lighttpd/1.4.13
X-Cache: HIT from sq13.wikimedia.org
X-Cache-Lookup: HIT from sq13.wikimedia.org:3128
X-Cache: HIT from knsq12.knams.wikimedia.org
X-Cache-Lookup: HIT from knsq12.knams.wikimedia.org:3128
Age: 6265
X-Cache: HIT from knsq10.knams.wikimedia.org
X-Cache-Lookup: HIT from knsq10.knams.wikimedia.org:80
Via: 1.0 sq13.wikimedia.org:3128 (squid/2.6.STABLE12), 1.0 knsq12.knams.wikimedia.org:3128 (squid/2.6.STABLE12), 1.0 knsq10.knams.wikimedia.org:80 (squid/2.6.STABLE12)
Connection: close

thread starvation

March 05, 2007

While running several benchmarks I saw that http_load was getting timeouts for some of the connections. This happened with all web servers, and with the different backends in lighttpd as well.

After patching http_load to handle timed-out connections and byte-count errors correctly, I could easily separate the timeouts from other problems. In one of the last changesets I added an infrastructure to track the time spent in lighttpd for a single connection, including the time spent in the different stages of the gthread-aio backend:

  • scheduling the threaded-read()
  • starting the read() in the thread
  • waiting until it is finished
  • sending the result into the main-loop
  • writing the buffered data to the socket

You can enable this timing by setting the define LOG_TIMING.

network_gthread_aio.c.78: (trace) write-start: 1173101341.237616 read-queue-wait: 680 ms read-time: 0 ms write-time: 21 ms
network_gthread_aio.c.78: (trace) write-start: 1173101341.229014 read-queue-wait: 576 ms read-time: 0 ms write-time: 134 ms
network_gthread_aio.c.78: (trace) write-start: 1173101341.240815 read-queue-wait: 681 ms read-time: 6 ms write-time: 19 ms

I wrote a script to extract this timing data from the errorlog and used gnuplot to turn it into images:


#!/bin/sh
## usage: $0 <benchmark-name>
##
## parse the errorlog for the lines from the timing_print
## - extract the numbers
## - sort them by start-time
## - make all timestamps relative to the first start-time

cat benchmark.error.log | \
    grep "network_gthread_aio.c.78:" | \
    awk '{ print $4,$6,$9,$12 }'  | \
    sort -n | \
    perl -ne '@e=split / /;if (!$start) { $start = $e[0]; } $e[0] = $e[0] - $start; print join " ", @e; ' > $1.data

gnuplot <<EOF
set autoscale
set terminal png
set xlabel "start-time of a request"
set ylabel "ms per request"
set yrange [0:30000]

set title "$1"
set output "$1.png"
plot \
"$1.data" using 1:2 title "read-queue-wait-time" with points ps 0.2, \
"$1.data" using 1:(\$2 + \$3) title "read-time" with points ps 0.2, \
"$1.data" using 1:(\$2 + \$3 + \$4) title "write-time" with dots

set title "$1 (read-queue-wait)"
set output "$1-read-queue-wait.png"
plot \
"$1.data" using 1:2 title "read-queue-wait-time" with points ps 0.8

set title "$1 (read)"
set output "$1-read.png"
plot \
"$1.data" using 1:3 title "read-time" with points ps 0.8 pt 2

set title "$1 (write)"
set output "$1-write.png"
plot \
"$1.data" using 1:4 title "write-wait-time" with points ps 0.8 pt 3
EOF

The first benchmark was run with:

./http_load -parallel 100 -fetches 500 ./http-load.10M.urls-10G


server.max-read-threads = 64
## compiled with 64k read-ahead

The time spent read()ing the data from the disk goes up:

[graph: 64 threads]

more detailed:

[graph: 64 threads, read() calls]

If you reduce the threads to 4 you get:

[graph: 4 threads]

and the read-time drops to:

[graph: 4 threads, read() calls]

Interesting for our timeouts are only the dots which leave the 4-second range, as those are our starving read() threads. If it takes too long for them to finish, the client will close the connection and the user will get a broken transfer.

Reducing the number of threads helps to limit the impact of the problem, as we can see above in the graphs:

threads-runnable = threads-started - threads-blocked

The more threads get stuck, the fewer threads can run, and the higher the probability that a stuck thread gets CPU time again. In the worst case all available threads are waiting and at least one of them will finish.
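The starving requests are easy to count from the extracted timing data. A sketch with made-up sample values; the 4000 ms threshold corresponds to the 4-second range mentioned above:

```shell
# sample .data format: start-time read-queue-wait read-time write-time
# (made-up values; the second record simulates a starving request)
printf '0.0 680 0 21\n0.1 4500 6 19\n0.2 576 0 134\n' > starve-sample.data

# count records whose read-queue-wait exceeds 4000 ms
awk '$2 > 4000 { n++ } END { printf "%d of %d requests starved\n", n, NR }' starve-sample.data
```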

more graphs

Rule of thumb

Keep max-threads at twice the number of disks.
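Applied to the directive used in the benchmark above, that gives a much smaller pool than the 64 threads we started with (the disk count is an assumed example):

```
## box with 2 data disks -> 2 * 2 = 4 read threads
server.max-read-threads = 4
```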

buffered IO performance

February 11, 2007

Besides the raw I/O performance, which is important for heavy static file transfers, the buffered I/O performance is more interesting for sites which have a small set of static files that can be kept in the fs caches.

As we are using hot caches for this benchmark, the “lightness” of the server becomes important. The fewer syscalls it has to do, the better.

The test case is made up of 100 MByte of files, once as 10 MByte files and once as 100 kByte files.
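The file sets can be recreated with dd — a sketch, since the article does not give the exact file names or directory layout:

```shell
# 100 MByte as 10 x 10 MByte files and 100 MByte as 1000 x 100 kByte files
# (file names and directory layout are made up)
mkdir -p htdocs/10M htdocs/100k
for i in $(seq 1 10); do
  dd if=/dev/zero of=htdocs/10M/file-$i.bin bs=1M count=10 2>/dev/null
done
for i in $(seq 1 1000); do
  dd if=/dev/zero of=htdocs/100k/file-$i.bin bs=1k count=100 2>/dev/null
done
```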



100 MByte of 100 kByte files served from the hot caches:

backend                MByte/s   req/s    user + sys
Apache 2.2.4 (event)   70.28     686.38   60%
LiteSpeed 3.0rc2       70.20     685.65   50%
  • linux-aio-sendfile loses most of its performance as it has to use O_DIRECT, which always means unbuffered reads.
  • Apache, LiteSpeed and linux-sendfile use the same syscall, sendfile(), and end up with the same performance values.
  • gthread-aio and posix-aio perform better than sendfile().
  • write() performs better than the threaded AIO and sendfile(); I can’t explain that right now :)


100 MByte of 10 MByte files served from the hot caches. The benchmark command has been changed compared to the other benchmarks:

$ http_load -verbose -timeout 40 -parallel 100 -fetches 500 http-load.10M.urls-100M

http_load does a hard cut when we use the -seconds option, so we might lose some MByte/s due to incomplete transfers.

backend                MByte/s   req/s   user + sys
Apache 2.2.4 (event)   50.92     5.33    40%
LiteSpeed 3.0rc2       55.58     5.80    40%
  • All the sendfile() implementations seem to have the same performance problem.
  • writev() and the threaded AIO backends utilize the network as expected.
  • linux-aio-sendfile is faster than the buffered sendfile(), even though it has to read everything from disk … strange.