buffered IO performance
February 11, 2007
Next to raw-IO performance, which is important for heavy, static file transfers, buffered IO performance is more interesting for sites that have a small set of static files which can be kept in the fs-caches.
As we are using hot caches for this benchmark, the “lightness” of the server becomes important. The fewer syscalls it has to do, the better.
The test case is made up of 100MByte of files, in sizes of 10MByte and 100kByte.
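A file set like this could be generated with a few lines of Python. This is only a sketch: the file names, the random content, the helper names `make_file_set` and `write_url_list`, and the base URL are all assumptions for illustration, not the setup used for the numbers in this post. The only external fact it relies on is that http_load reads a plain list of URLs, one per line.

```python
import os
from pathlib import Path


def make_file_set(directory, file_size, total_size, prefix="file"):
    """Create total_size // file_size files of file_size bytes each.

    Returns the list of created paths. Names and content are illustrative.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    # One block of random bytes, reused to fill every file.
    chunk = os.urandom(min(file_size, 64 * 1024))
    paths = []
    for index in range(total_size // file_size):
        path = target / f"{prefix}-{index:04d}.bin"
        with open(path, "wb") as handle:
            remaining = file_size
            while remaining > 0:
                piece = chunk[:remaining]
                handle.write(piece)
                remaining -= len(piece)
        paths.append(path)
    return paths


def write_url_list(paths, base_url, list_path):
    """Write one URL per line, the format http_load reads."""
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(f"{base_url}/{path.name}\n")
```

Assuming decimal units, `make_file_set("docroot/100k", 100_000, 100_000_000)` would give 1000 files of 100kByte and `make_file_set("docroot/10M", 10_000_000, 100_000_000)` would give 10 files of 10MByte. The `docroot` paths are hypothetical.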
Benchmark
100kByte
100MByte of 100kByte files served from the hot caches:
| backend | MByte/s | req/s | user + sys |
|---|---|---|---|
| lighttpd | | | |
| writev | 82.20 | 802.71 | 90% |
| linux-sendfile | 70.27 | 686.32 | 56% |
| gthread-aio | 75.39 | 736.23 | 98% |
| posix-aio | 73.10 | 713.88 | 98% |
| linux-aio-sendfile | 31.32 | 305.90 | 35% |
| others | | | |
| Apache 2.2.4 (event) | 70.28 | 686.38 | 60% |
| LiteSpeed 3.0rc2 | 70.20 | 685.65 | 50% |
- `linux-aio-sendfile` is losing most of its performance as it has to use `O_DIRECT` to operate, which always results in an unbuffered read.
- Apache, LiteSpeed and `linux-sendfile` are using the same syscall, `sendfile()`, and end up with the same performance values.
- gthread-aio and posix-aio perform better than `sendfile()`.
- `write()` performs better than the threaded AIO and `sendfile()`. I can’t explain that right now :)
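To make the difference between the backends a bit more concrete, here is a minimal Python sketch of the two I/O patterns compared above: handing the file to the kernel with `sendfile()` versus reading it into user space and gather-writing it with `writev()`. It is not lighttpd's code. The function names, chunk sizes, and the local socket pair are assumptions for illustration, and it assumes a Linux system where `os.sendfile` and `os.writev` are available.

```python
import os
import socket
import threading


def send_with_sendfile(sock, path):
    """Let the kernel copy file pages straight to the socket."""
    sent = 0
    with open(path, "rb") as source:
        size = os.fstat(source.fileno()).st_size
        while sent < size:
            # One syscall per chunk; the payload never enters user space.
            sent += os.sendfile(sock.fileno(), source.fileno(), sent, size - sent)
    return sent


def send_with_writev(sock, path, chunk_size=64 * 1024, chunks_per_call=4):
    """Read into user-space buffers, then gather-write them in one call."""
    sent = 0
    with open(path, "rb") as source:
        while True:
            buffers = []
            for _ in range(chunks_per_call):
                data = source.read(chunk_size)
                if not data:
                    break
                buffers.append(data)
            if not buffers:
                break
            total = sum(len(buffer) for buffer in buffers)
            written = os.writev(sock.fileno(), buffers)
            if written < total:
                # Short write: push the unsent tail with a plain send loop.
                sock.sendall(b"".join(buffers)[written:])
            sent += total
    return sent


def transfer(sender, path):
    """Run `sender` over a local socket pair and return the bytes received."""
    left, right = socket.socketpair()
    received = bytearray()

    def drain():
        while True:
            data = right.recv(65536)
            if not data:
                break
            received.extend(data)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        sender(left, path)
    finally:
        left.close()  # signals end-of-stream to the reader
        reader.join()
        right.close()
    return bytes(received)
```

Both senders deliver the same bytes; what differs is where the copy happens and how many syscalls are needed per chunk. The sketch says nothing about which one is faster on a given kernel.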
10MByte
100MByte of 10MByte files served from the hot caches. The benchmark command has been changed in the same way as in the other benchmarks:
$ http_load -verbose -timeout 40 -parallel 100 -fetches 500 http-load.10M.urls-100M
http_load does a hard cut when the -seconds option is used, so we might lose some MByte/s due to incomplete transfers.
| backend | MByte/s | req/s | user + sys |
|---|---|---|---|
| lighttpd | | | |
| writev | 82.20 | 8.76 | 80% |
| linux-sendfile | 53.95 | 5.65 | 40% |
| gthread-aio | 83.02 | 8.66 | 90% |
| posix-aio | 82.31 | 8.60 | 93% |
| linux-aio-sendfile | 70.17 | 7.35 | 60% |
| others | | | |
| Apache 2.2.4 (event) | 50.92 | 5.33 | 40% |
| LiteSpeed 3.0rc2 | 55.58 | 5.80 | 40% |
- All the `sendfile()` implementations seem to have the same performance problem.
- `writev()` and the threaded AIO backends utilize the network as expected.
- `linux-aio-sendfile` is faster than the buffered `sendfile()` even if it has to read everything from disk … strange