Server Tuning for High Performance
by Jay Lorenzo
A few months ago, I was involved in a change of server platform for the external servers at WRQ. In retrospect, it was a great learning experience that showed how years of maintaining a stable platform with only minor tuning issues can lull you into a false sense of security, particularly when you change both server hardware and software. This month I'd like to discuss a few observations from this experience and, I hope, shed some light on a few approaches you can take when optimizing server performance. This column is definitely going to show some Unix bias, but fundamentally, the same tuning issues apply to any Web platform.
I tend to see tuning as being divided into four categories: CPU usage, disk I/O, memory utilization, and network performance. In reality, a single tuning issue can often span multiple categories, but if you analyze your server as four distinct areas, you will tend to find more potential improvements than if you had viewed the server as one system. Let's look at some of the potential tuning issues you face when taking this approach.
I find it useful, when tuning, to try to get a baseline measurement of these four parameters over the course of a week. This gives me an idea of when peak load times occur, and an understanding of what resources are being used by the various processes on the server. You will find baseline readings to be particularly useful as a comparison tool when your server starts misbehaving. Some Web administrators tend to focus solely on HTTP server log files for baseline measurements.
While HTTP server logs can provide a great deal of information, they are only a part of the total picture. Most platforms have system accounting and performance monitoring utilities that can log CPU, disk, and memory activity. Although these programs consume resources themselves, and so can skew the readings they report, they are very useful. Other tools, such as netstat, should be employed to capture snapshots of network performance.
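As one small piece of such a baseline, the peak load times can be estimated from the HTTP server log itself. The following is a minimal sketch, assuming the log is in Common Log Format; the function names requests_per_hour and peak_hours are hypothetical, and this covers only the HTTP log portion of the picture, not CPU, disk, or memory.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: Common Log Format timestamps, e.g. [10/Mar/1997:13:55:36 -0800]
TIMESTAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def requests_per_hour(lines):
    """Count requests per (weekday, hour) bucket; weekday 0 is Monday."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
        counts[(stamp.weekday(), stamp.hour)] += 1
    return counts


def peak_hours(counts, top=3):
    """Return the busiest (weekday, hour) buckets with their request counts."""
    return counts.most_common(top)
```

Feeding a week of log lines to requests_per_hour and then calling peak_hours on the result gives a rough picture of when the server is busiest, which can be compared against the system accounting data for the same hours.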
CPU utilization has been the least troublesome tuning issue, in my experience. HTTP services aren't CPU-intensive, so the major focus when addressing CPU issues is to make sure that you have a server fast enough to process the number of HTTP operations per second that your service requires. Closely linked to this is the need to run HTTP server programs that are well designed, and to make sure you are maximizing the efficiency of your CGI, database or other gateway programs.
One area that is often overlooked is the amount of resources that a search and indexing engine will consume. This is a good argument for baseline measurements which will indicate which processes are responsible for the bulk of resource use.
When trying to improve the throughput of CGI programs, consider writing directly to the server's API if available. Netscape, Microsoft and Apache all have very robust APIs available for their respective platforms, which allow you to implement programs that reduce both CPU and memory consumption. The downside, of course, is that the APIs usually require programs written in C or C++ in threaded environments, which increases the time for deployment, but if you are trying to maximize throughput, it is well worth the time spent.
Just Disking Around
Disk I/O requires a close look when optimizing performance. Almost all of the systems I work with are SCSI-based disk subsystems, which I believe give greater flexibility in drive configurations. I tend to favor using a number of small (typically 1 GB or less) hard drives as opposed to one or two large drives. Disk I/O can be a substantial bottleneck on fast systems, and having multiple disks seeking and retrieving information at the same time can be a much more efficient use of resources.
On heavily-used servers, I recommend dedicating a single drive just for log files, as I find it to be the most active file system under load. Databases also fall under this requirement, as the file systems that contain database records may need to be spread among several disks to improve database throughput. When the situation permits, I prefer separating database servers from Web servers, as they tend to have individual tuning requirements that are not always compatible.
Down Memory Lane
Memory tuning can be both science and art, as it covers a great deal of territory especially when dealing with Unix systems. Memory utilization affects swap space, network buffering, application and OS memory usage and caching. Each one of these areas can impact server performance substantially. The most difficult part of memory tuning is understanding which process, or processes, are responsible for the consumption of memory.
Unix programs such as swap, sar, ps, and top can help you identify the memory usage requirements for your OS and applications. On Windows NT, the included Performance Monitor is an excellent tool for performance analysis. The primary rule is to remember that you can never have too much memory. If you are running a Unix system, you must minimize the paging of memory to disk, since server performance will suffer when paged memory is used.
In addition to application and OS memory usage, make sure that you retain sufficient memory for network usage. Your servers are very dependent upon memory to be used as network buffers, which are responsible for the receipt and delivery of data to your server. Network buffer requirements can be identified by running netstat -m on Unix hosts. Pay close attention to any indications of memory requests being denied or delayed, which would indicate insufficient memory. It may be more efficient to decrease the size of the buffers servicing requests, as much of the HTTP traffic normally found at Web sites is considerably smaller than typical default buffer sizes.
Another network memory issue that merits notice is the amount of memory used for reverse DNS lookups when logging host names. You can save a considerable amount of memory and network usage by turning off DNS lookups on the server, and possibly reassigning the task to another machine that processes the log files offline and does the reverse resolution at that time.
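The offline resolution step might look something like the sketch below. It assumes each log line begins with the client's IP address followed by a space, as in Common Log Format; the function names resolve_host and resolve_log_lines are hypothetical. Results are cached so each address is looked up only once, and unresolvable addresses are left as they are.

```python
import socket


def resolve_host(address, cache, lookup=socket.gethostbyaddr):
    """Return the host name for an IP address, or the address itself on failure."""
    if address not in cache:
        try:
            # gethostbyaddr returns (hostname, aliases, addresses)
            cache[address] = lookup(address)[0]
        except OSError:
            cache[address] = address  # no reverse record: keep the raw address
    return cache[address]


def resolve_log_lines(lines, lookup=socket.gethostbyaddr):
    """Yield log lines with the leading IP address replaced by its host name."""
    cache = {}
    for line in lines:
        address, sep, rest = line.partition(" ")
        yield resolve_host(address, cache, lookup) + sep + rest
```

The lookup parameter exists only so the resolver can be swapped out, for example for testing without network access; by default the standard socket.gethostbyaddr call is used.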
Stuffing the Pipe
Network performance is the hardest component to measure consistently, which should come as no surprise when you consider how much the Internet changes. In the same vein as memory, there is no substitute for more capacity when it is needed.
One of the best tuning tricks for Unix servers involves increasing the size of the listening queues for incoming TCP connections. The need arises from the long delays that are inherent in modem-based connections: your typical user is probably visiting your site at 28.8 Kbps or slower, so each request takes your server longer to service, and the server needs a longer queue to hold the other incoming requests that are waiting for a response.
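From the application side, the queue length is requested through the backlog argument to listen(). The following is a minimal sketch using Python's standard socket module; the function name make_listener and the backlog value used in the usage note are illustrative assumptions. The kernel may silently cap the requested value at a system-wide limit, and raising that limit is a platform-specific step not shown here.

```python
import socket


def make_listener(host, port, backlog):
    """Create a TCP listening socket that requests the given queue length."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts of the server on the same port.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    # backlog is a request; the operating system may cap it at its own limit.
    listener.listen(backlog)
    return listener
```

For example, make_listener("127.0.0.1", 0, 128) asks for a queue of 128 pending connections on a port chosen by the operating system. Most HTTP servers expose an equivalent setting in their own configuration, so check your server's documentation before writing anything yourself.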
As mentioned earlier, it is sometimes beneficial to modify the number of buffers, as well as the buffer size, to improve performance. I usually prefer to make sure I have sufficient memory before I attempt to tweak buffer sizes. For Unix, the netstat -s command can give a wealth of information. Windows NT will also give you usable output with netstat, but not quite the detail that you will see on most Unix systems.
As we undergo growing pains in the Internet infrastructure, you will find that special attention has to be placed on the number of open connections your server maintains. TCP connections start with a 3-way handshake and finish with a separate exchange in which each side sends a FIN and has it acknowledged. Unfortunately, because many connections are broken off before they are properly closed, you may find that your machine is maintaining a large number of broken connections, which in effect limits the number of incoming connections that can be serviced. Netstat will likely show these connections in the FIN_WAIT_2 state (displayed as FIN_WAIT2 on some systems).
If you do see an inordinately high number of open connections, you should consult your documentation to see if the default value for TCP timeout can be changed. Be sure not to set it too low if the machine is to be used for other purposes.
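To judge whether the number of lingering connections is out of the ordinary, it helps to tally connections by state. The sketch below assumes netstat -an style output in which TCP lines start with "tcp" and carry the connection state as the last of at least six whitespace-separated fields; column layouts differ between platforms, so treat the parsing as an assumption to adapt. The function name count_tcp_states is hypothetical.

```python
from collections import Counter


def count_tcp_states(netstat_lines):
    """Tally TCP connections by state from netstat -an style output lines."""
    counts = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Assumed layout: proto, recv-q, send-q, local, remote, state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts
```

Running this against saved netstat output at regular intervals, alongside your other baseline readings, shows whether one state is steadily accumulating connections.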
Hopefully, this column has given you a few things to think about. Let me throw in a couple of things that transpired in the real world when putting the new servers online at WRQ. After we migrated our content and programming to the new box, we did in-house testing to establish benchmarks and to tune as needed. Although our intentions were good, this was a less than accurate method. We never exposed the server to the latency and connection difficulties that are inherent with the Internet. Accordingly, our benchmarks and tuning were subject to some rapid changes, to the point that we actually changed our server software to achieve the performance we needed.
Tuning is definitely a fine art that takes quite a while to get the hang of. If you have any additional insights or comments, feel free to reach me at firstname.lastname@example.org.
Reprinted from Web Developer magazine, Vol. 3 No. 2 Mar/Apr 1997 (c) 1997 internet.com Corporation. All rights reserved.