Produce Streaming Audio That Satisfies
By Jay Lorenzo
After a somewhat slow start, Web sites that are capable of delivering relatively low-bandwidth audio content are appearing with greater frequency, most likely in response to the increasing number of multimedia-capable PCs hooking into the Internet. The current offerings from some of the major suppliers of Internet audio software now include the ability to stream live audio across the Net, typically through 14.4 Kbps and 28.8 Kbps modems, which in turn has fueled the growth of Web "radio" programming and other real-time content.
There are a number of different approaches taken for Internet-based audio delivery. For more information on what's available, see "Does Multimedia Have a Dark Side?" on p. 24.
Server-based audio solutions are currently the only way to stream live audio on the Internet. Most people will find the installation of a server to be the least complicated component of delivering audio. The server install is somewhat similar to setting up an httpd server: a stand-alone daemon reads a configuration file on initialization, and that file specifies the root location of the encoded audio files. In this column, we are going to focus on the process of encoding audio and delivering it from your Web site, using as an example the RealAudio 2.0 server and audio tools, which I recently tested for use on the WRQ Web Connection.
Consider Your Source
Regardless of the delivery mechanism used, all processes of delivering audio on the Web are basically similar. An audio source must be converted to a digital format, which is then encoded; in the process the audio is compressed into a more manageable delivery size. When providing audio content for low-bandwidth encoding, it is crucial to provide as high a quality of source material as possible. It is important to realize that the compression used for encoding is lossy, meaning a great deal of information is removed from the original signal to reach the target size. A high-quality source gives the encoder more meaningful data to analyze, which helps produce a noticeably better end product.
You should consider minimizing the number of sound sources used in your recording. Background noise, in particular, should be reduced or eliminated whenever possible, through the use of close miking and noise gates. If you want to ensure that you will obtain the best audio performance, you should look into renting or buying professional audio gear. This includes the use of high-quality microphones and mixers, DAT recorders, audio compressors, noise gates, and equalizers, as well as high-quality computer audio interface equipment.
You may find a slight divergence from traditional recording techniques, as it is important to remember that you're working toward a particular final product. You will find in some circumstances that an audio source may need what initially sounds like excessive equalization or signal processing to provide the best audio quality after encoding, so be sure to experiment accordingly.
As an example, the audio source files I used for evaluating RealAudio at WRQ were originally recorded through an AKG condenser microphone and stored digitally on an Alesis ADAT digital multitrack recorder, arguably overkill for the speech-based content being tested. The ADAT's outputs were mixed down to a mono signal using a Mackie mixer and processed through an Alesis Quadraverb 2 (which provided some compression and reverb), then transferred to our test encoder machines, which consisted of a Sparc 5, a Windows NT PC, and a PowerMac.
Audio input is fairly straightforward for the Sparc and Mac, as they have the necessary hardware built into the box. The Windows NT box, a Compaq Pentium-based system, requires a good-quality sound card; we used a Soundblaster AWE32.
The Sparc appears to be a good choice when planning to stream live audio, but the additional signal processing tools available on the Net for the Mac and Windows make those platforms more desirable solutions when you are encoding pre-existing files stored on disk. You may wish to take a look at CoolEdit95 and GoldWave if your encoding platform is Windows-based. The Mac also has numerous sound utilities to help you fine tune your audio content; is a good place to start your search. If you're looking for more professional tools, be sure to visit Digidesign's site.
Preprocess Before Encoding
When using pre-existing source material, it is not uncommon to find digital audio files that are hundreds of megabytes or more in size. Be sure that you have sufficient hard disk capacity for both the source and final encoded audio content. Given the relatively low cost of hard drives, it is wise to set aside at least a gigabyte of working space if you are entertaining thoughts of hour-long audio files. If you are planning to archive your source material, a tape backup is essential.
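To see why an hour of source audio runs to hundreds of megabytes, here is a rough sizing sketch. The sample rate, bit depth, and channel counts are assumptions (CD-quality figures); the column does not say what format your source files will use, and the function name is purely illustrative.

```python
def pcm_size_bytes(seconds, sample_rate_hz, bits_per_sample, channels):
    """Size of uncompressed PCM audio in bytes (file headers ignored)."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * bytes_per_sample * channels


ONE_HOUR = 60 * 60  # seconds

# Assumed CD-quality source: 44.1 kHz, 16-bit samples.
stereo = pcm_size_bytes(ONE_HOUR, 44_100, 16, channels=2)
mono = pcm_size_bytes(ONE_HOUR, 44_100, 16, channels=1)

print(f"One hour, stereo: {stereo / 1e6:.0f} MB")
print(f"One hour, mono:   {mono / 1e6:.0f} MB")
```

Under these assumptions an hour of stereo comes to about 635 MB and an hour of mono to about 318 MB, so a gigabyte of working space leaves room for the source plus edited copies.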
When dealing with pre-existing audio, you should consider amplifying and normalizing your audio, which raises the volume or amplitude of your content by scaling the whole signal so that its loudest peaks sit just below signal clipping. CoolEdit is particularly useful for this type of work. It is usually a good idea to normalize to 95 percent of the clipping threshold to make sure that the RealAudio Encoder does not choke on any audio peaks that approach that threshold; digital audio that exceeds the 100 percent limit sounds incredibly bad.
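For readers who want to see what normalizing does to the numbers, the following is a minimal sketch of peak normalization to 95 percent. It assumes samples are floating-point values where 1.0 is the clipping threshold; it is not how CoolEdit or the RealAudio Encoder implements the operation, and the function name is hypothetical.

```python
def normalize_peak(samples, target=0.95):
    """Scale samples so the largest absolute peak equals `target`.

    Assumes floats where +/-1.0 represents the clipping threshold.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]


quiet = [0.0, 0.1, -0.25, 0.2]   # loudest peak is only 25 percent of full scale
louder = normalize_peak(quiet)   # loudest peak is now 95 percent of full scale
print(max(abs(s) for s in louder))
```

Because every sample is multiplied by the same gain, the relative dynamics of the recording are unchanged; only the overall level moves.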
Another reason to use digital audio editors on pre-recorded material is to correct an electrical effect known as "DC offset." This is a common signal problem that occurs with insufficiently grounded sound cards, and it can result in a rumbling effect in the encoded audio. A good fix is to record silence, which should show a flat line at zero amplitude (the center of the waveform display); if the line sits above or below center, use your digital editor to shift the "silent" signal's level back to its proper setting.
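The DC offset fix described above can be sketched in a few lines. This assumes samples are floats centered on 0.0 and that the offset is constant across the recording; the function names are illustrative and do not correspond to any editor's menu commands.

```python
def measure_dc_offset(silence):
    """Average level of a recording of silence; ideally this is 0.0."""
    return sum(silence) / len(silence)


def remove_dc_offset(samples, offset):
    """Shift every sample by the measured offset to re-center the signal."""
    return [s - offset for s in samples]


# A "silent" recording from a card that sits slightly above center.
silence = [0.02, 0.02, 0.02, 0.02]
offset = measure_dc_offset(silence)

# Apply the same correction to program audio recorded through that card.
program = [0.02, 0.52, -0.48, 0.02]
centered = remove_dc_offset(program, offset)
print(offset, centered)
```

Measuring the offset from recorded silence, rather than from the program itself, follows the approach in the text and avoids being misled by program material that is itself asymmetric.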
Once you have finished preprocessing, the encoding process itself is easy. When using the RealAudio encoder, select the target bandwidth for which the source should be encoded. RealAudio servers have the ability to negotiate content delivery based on the RealAudio Player's setting, and deliver either a 14.4 Kbps or 28.8 Kbps bandwidth selection. Accordingly, this also means that you have to encode each source twice if you plan to offer users the choice of negotiated content delivery. There are still quite a few users who surf the Web using 14.4 modems, but the audio quality of 28.8 is noticeably better and should be offered if at all possible.
You then choose a source: a live stream of audio or an existing digitized audio file. The destination section of the Encoder Window allows you to save the encoded audio to a file with an .ra suffix, or to send it to a RealAudio server which you specify in the destination settings. The current version of the encoder implements a compression algorithm that results in just 3.6 MB of file space for each hour of source audio, which allows a great deal of pre-recorded content to be placed on a typical server. The encoder also lets you embed information about the file that will be displayed by the RealAudio Player when playing the file. This is accomplished by embedding a text tag that provides the title, author, and copyright of the material you are encoding.
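The 3.6 MB per hour figure can be turned into a data rate with a little arithmetic. The sketch below assumes 1 MB means 1,000,000 bytes; the column does not say which of the two encodings the figure refers to, and the function names are illustrative.

```python
def implied_bitrate_bps(bytes_per_hour):
    """Average bits per second implied by an hourly file size."""
    return bytes_per_hour * 8 / 3600


def encoded_size_mb(hours, mb_per_hour=3.6):
    """Disk space needed for a given number of hours of encoded audio."""
    return hours * mb_per_hour


rate = implied_bitrate_bps(3_600_000)  # 3.6 MB, assuming 1 MB = 10**6 bytes
print(f"Implied stream rate: {rate:.0f} bits per second")
print(f"100 hours of content: {encoded_size_mb(100):.0f} MB")
```

Under that assumption the stream averages 8,000 bits per second, comfortably below the nominal 14,400 bits per second of a 14.4 modem, and a hundred hours of encoded audio fits in about 360 MB.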
Live audio streaming seems to be gaining some momentum in the Web community; the encoder is capable of streaming this content directly to the RealAudio server via TCP/IP for live events. This magic is accomplished through the use of a server-side application known as the Live Transfer Agent (LTA). The LTA resides on the same machine as the RealAudio Server. It takes the incoming stream and converts it on the fly, allowing the server to deliver the audio content. A similar program, called SLTA, allows the use of pre-recorded material to simulate the delivery of live audio. Note that it is necessary to use two encoders and LTAs simultaneously if you plan to deliver both 14.4 and 28.8 audio for a live event.
As I have previously mentioned, live audio is much more challenging. It is necessary to plan ahead and try to anticipate the volume changes your recording will go through. You need to provide the best audio signal that you can while making sure not to overload the encoder with clipped audio; peaks can usually be reined in with compression. Do as much testing as possible before the live event occurs, and consider implementing a time-delayed broadcast if possible, which should give you some margin for error.
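To illustrate what compression does to a signal that would otherwise clip, here is a deliberately simplified sketch. It applies a static gain curve to each sample independently; real compressors (hardware or software) track the signal envelope with attack and release times, so treat this only as a picture of the threshold and ratio controls. The parameter values and the function name are assumptions, and samples are floats where 1.0 is the clipping threshold.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Reduce the portion of each sample's level above `threshold` by `ratio`.

    Simplified: works sample by sample, with no attack or release smoothing.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # Only the excess over the threshold is scaled down.
            level = threshold + (level - threshold) / ratio
        out.append(level if s >= 0 else -level)
    return out


# The last sample would clip (beyond 1.0) without compression.
hot = [0.2, 0.9, -1.3]
print(compress(hot))
```

With a threshold of 0.5 and a 4:1 ratio, the quiet sample passes through untouched while the two loud ones are pulled down to roughly 0.6 and -0.7, leaving headroom before the encoder.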
Producing usable audio can be a trying experience, particularly when you realize that the audio quality at best will be on par with a mono FM signal. That being said, properly prepared audio can add a high degree of quality to the experience someone has visiting your site. It takes time and patience to produce good audio content. If you are interested in learning more, I have put a list of resources and starting points at audio.html. As usual, feel free to send your comments and suggestions to me at email@example.com -- WD
Jay Lorenzo currently spends his time overseeing administration of the internal and external Web servers at Walker Richer & Quinn, Inc. He is a partner in Strings, a Seattle-based Internet consulting company.