Bottleneck Guitar

Introducing Bonnie

Bonnie reflects my prejudices as to what the real bottlenecks are in server applications. These were formed by work first at the New Oxford English Dictionary project at the University of Waterloo, then at Open Text Corporation.

Bonnie was written in 1988, and revised in 1989 and 1996.


Why Bonnie?

I believe that:

What Bonnie Does

Bonnie performs a series of tests on a file of known size. If the size is not specified, Bonnie uses 100 Mb; but that probably isn't enough for a big modern server - you want your file to be a lot bigger than the available RAM.

Bonnie works with 64-bit pointers if you have them.

For each test, Bonnie reports the bytes processed per elapsed second, per CPU second, and the % CPU usage (user and system).

In each case, an attempt is made to keep optimizers from noticing it's all bogus. The idea is to make sure that these are real transfers between user space and the physical disk. The tests are:

1. Sequential Output

1.1 Per-Character

The file is written using the putc() stdio macro. The loop that does the writing should be small enough to fit into any reasonable I-cache. The CPU overhead here is that required to do the stdio code plus the OS file space allocation.
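In outline, the per-character writer looks something like the sketch below. This is not Bonnie's actual code; the path and size arguments are placeholders, and the byte written is arbitrary.

    #include <stdio.h>

    /* Write "size" bytes to "path" one character at a time with putc().
     * A sketch only; the tiny loop body is the point. */
    static void write_per_char(const char *path, long size)
    {
        FILE *f = fopen(path, "w");
        long  i;

        if (f == NULL)
            return;
        for (i = 0; i < size; i++)
            putc(i & 0x7f, f);   /* stdio buffers; the kernel sees large writes */
        fclose(f);
    }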

1.2 Block

The file is created using write(2). The CPU overhead should be just the OS file space allocation.
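A sketch of the block-output loop, under the assumption that the file size is a multiple of the chunk size and with error handling omitted:

    #include <fcntl.h>
    #include <unistd.h>

    #define Chunk 16384              /* Bonnie's chunk size */

    /* Create "path" and fill it with "size" bytes, Chunk bytes per write(2). */
    static void write_blocks(const char *path, long size)
    {
        static char buf[Chunk];      /* contents don't matter */
        int  fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        long done;

        for (done = 0; done < size; done += Chunk)
            write(fd, buf, Chunk);   /* one system call per Chunk */
        close(fd);
    }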

1.3 Rewrite

Each chunk of the file (currently 16384 bytes) is read with read(2), dirtied, and rewritten with write(2), requiring an lseek(2). Since no space allocation is done, and the I/O is well-localized, this should test the effectiveness of the filesystem cache and the speed of data transfer.
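The rewrite pattern, in outline (a sketch, not the real code; error handling omitted):

    #include <fcntl.h>
    #include <unistd.h>

    #define Chunk 16384

    /* Read each chunk, dirty one byte, seek back, and rewrite it in place.
     * No space allocation happens: the file already exists at full size. */
    static void rewrite_file(const char *path)
    {
        char    buf[Chunk];
        int     fd = open(path, O_RDWR);
        ssize_t n;

        while ((n = read(fd, buf, Chunk)) > 0) {
            buf[0]++;                    /* dirty the chunk */
            lseek(fd, -n, SEEK_CUR);     /* back to the start of the chunk */
            write(fd, buf, n);           /* rewrite it in place */
        }
        close(fd);
    }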

2. Sequential Input

2.1 Per-Character

The file is read using the getc() stdio macro. Once again, the inner loop is small. This should exercise only stdio and sequential input.
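A sketch of the per-character reader; the running sum is there only so the compiler can't throw the loop away, and is not what Bonnie actually computes.

    #include <stdio.h>

    /* Read the whole file one character at a time with getc(). */
    static long read_per_char(const char *path)
    {
        FILE *f = fopen(path, "r");
        int   c;
        long  sum = 0;

        if (f == NULL)
            return -1;
        while ((c = getc(f)) != EOF)
            sum += c;            /* keep the optimizer honest */
        fclose(f);
        return sum;
    }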

2.2 Block

The file is read using read(2). This should be a very pure test of sequential input performance.
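The corresponding block-input loop, again as a sketch with error handling omitted:

    #include <fcntl.h>
    #include <unistd.h>

    #define Chunk 16384

    /* Read the file Chunk bytes at a time with read(2). */
    static long read_blocks(const char *path)
    {
        char    buf[Chunk];
        int     fd = open(path, O_RDONLY);
        long    total = 0;
        ssize_t n;

        while ((n = read(fd, buf, Chunk)) > 0)
            total += n;
        close(fd);
        return total;
    }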

3. Random Seeks

This test runs SeekProcCount (currently 4) processes in parallel, doing a total of 4000 lseek()s to locations in the file computed using random() on BSD systems and drand48() on SysV systems. In each case, the block is read with read(2). In 10% of cases, it is dirtied and written back with write(2).

The idea behind the SeekProcCount processes is to make sure there's always a seek queued up.
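In outline, the seek test looks something like the sketch below. It is not Bonnie's actual code: it uses random() as on BSD systems, assumes the file size is a multiple of Chunk, and the per-process Seeks constant is an assumption chosen so that 4 processes do 4000 seeks in total.

    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define Chunk         16384
    #define SeekProcCount 4
    #define Seeks         1000          /* 4 x 1000 = 4000 seeks in total */

    /* One seeker: lseek(2) to a random chunk, read(2) it, and in roughly
     * 10% of cases dirty it and write(2) it back. */
    static void seeker(const char *path, long size)
    {
        char buf[Chunk];
        int  fd = open(path, O_RDWR);
        int  i;

        for (i = 0; i < Seeks; i++) {
            long block = random() % (size / Chunk);
            lseek(fd, block * Chunk, SEEK_SET);
            read(fd, buf, Chunk);
            if (random() % 10 == 0) {            /* ~10% of the time */
                buf[0]++;                        /* dirty the block */
                lseek(fd, block * Chunk, SEEK_SET);
                write(fd, buf, Chunk);
            }
        }
        close(fd);
    }

    /* Run SeekProcCount seekers in parallel so a seek is always queued up. */
    static void seek_test(const char *path, long size)
    {
        int i;

        for (i = 0; i < SeekProcCount; i++)
            if (fork() == 0) {
                srandom(getpid());               /* different stream per child */
                seeker(path, size);
                _exit(0);
            }
        for (i = 0; i < SeekProcCount; i++)
            wait(NULL);
    }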

AXIOM: For any unix filesystem, the effective number of lseek(2) calls per second declines asymptotically to near 30, once the effect of caching is defeated. [ I wrote the previous sentence in about 1988, and it's a bit better now, but not much ]

The size of the file has a strong nonlinear effect on the results of this test. Many Unix systems that have the memory available will make aggressive efforts to cache the whole thing, and report random I/O rates in the thousands per second, which is ridiculous. As an extreme example, an IBM RISC 6000 with 64 Mb of memory reported 3,722 per second on a 50 Mb file. Some have argued that bypassing the cache is artificial, since the cache is just doing what it's designed to do. True, but in any application that requires rapid random access to file(s) significantly larger than main memory, running on a system that is doing significant other work, the caches will inevitably max out. There is a hard limit hiding behind the cache which has been observed by the author to be of significant import in many situations - what we are trying to do here is measure that number.