Open Sound System
There are several myths and urban legends related to audio. Unfortunately many programmers think these myths are correct and then try to find workarounds to problems that simply don't exist.
"Blocking" refers to the automatic wait behaviour of the read and write system calls. It's known that in some cases this blocking prevents the application from serving keyboard, mouse and other events quickly enough. The end user sees this as general unresponsiveness and sluggishness. For this reason blocking must be avoided like hell. Sometimes this is OK, but in the worst cases it results in applications that use select/poll, usleep and/or various ioctl calls to wait until it's possible to write to the device without blocking.
This is completely unnecessary. The read and write system calls of OSS will automatically wait in an optimal way if there is not enough space available in the buffer. There is no way to do this better in the application.
Some programmers try to avoid blocking because it may make the application feel sluggish and unresponsive. This is true in some cases, but in general the workaround for this is to write smaller amounts of audio data each time. See the Audio timing considerations section for more info.
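The idea of writing smaller amounts at a time can be sketched as follows. This is only a minimal illustration, not code from OSS: the helper name write_in_chunks, its parameters and the optional callback are hypothetical, and the descriptor is assumed to be an already opened audio device in normal (blocking) mode.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes `len` bytes to a blocking descriptor,
 * at most `chunk` bytes per write call. After every chunk the optional
 * `between` callback runs; an application could serve its keyboard and
 * mouse events there. Returns the number of bytes written, or -1. */
ssize_t write_in_chunks(int fd, const char *buf, size_t len, size_t chunk,
                        void (*between)(void *), void *arg)
{
    size_t done = 0;

    if (chunk == 0)
        return -1;

    while (done < len) {
        size_t want = len - done;
        if (want > chunk)
            want = chunk;

        /* Plain blocking write: it waits by itself if the buffer is full. */
        ssize_t n = write(fd, buf + done, want);
        if (n <= 0)
            return -1;
        done += (size_t)n;

        if (between != NULL)
            between(arg);
    }
    return (ssize_t)done;
}
```

The smaller the chunk, the shorter each wait inside write can be, so the application gets back to its other work more often without any select/poll or usleep logic.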
The reason for this myth has been some buggy freeware OSS implementations that by default wait until the device is available. This has resulted in unbearable delays when some other application is using the device. For this reason practically all applications have used O_NONBLOCK to get sane behaviour. Fortunately this bug has been fixed in the freeware implementations and there is no reason to use O_NONBLOCK any more. In particular, no OSS 4.0 compatible drivers need it.
The reason why using O_NONBLOCK is a problem is that it turns on the so-called non-blocking I/O feature, which affects read/write calls too. This is not what most applications want. Non-blocking mode can be turned off later by using fcntl(), but most programmers don't know that.
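Turning non-blocking mode off with fcntl() can look roughly like this. The function name make_blocking is made up for this sketch; F_GETFL, F_SETFL and O_NONBLOCK are the standard POSIX names.

```c
#include <fcntl.h>

/* Clears O_NONBLOCK on an open descriptor so that later read and write
 * calls block normally. Returns 0 on success and -1 on error. */
int make_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    /* Keep all other status flags, drop only O_NONBLOCK. */
    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
        return -1;

    return 0;
}
```

An application that still opens the device with O_NONBLOCK could call this right after a successful open.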
Non-blocking I/O requires special handling of some error codes returned by the read and write calls. The application also needs to be prepared to handle partial reads and writes, which just adds complexity to the application. For this reason non-blocking I/O is not recommended. OSS provides several much easier methods for getting the same results. Please see the Audio timing considerations section for more info.
Novice audio programmers often ask questions like: "What is the lowest latency I can get with this device (or driver)?". Actually this is the wrong question. The right questions are "How low a latency do I need?" and "Can this device give me latencies like this?".
Handling latencies is not a maximizing problem. It's an optimization problem. Programmers who don't understand the difference between maximizing and optimizing should probably avoid writing programs that require low latencies. Optimal latencies mean the level where the user doesn't notice that something is wrong. Anything beyond that is mostly just a waste of human and natural resources.
After you have figured out the latencies required for your application you will probably find out that instead of low latencies (milliseconds) you actually need normal latencies (tens of milliseconds). So instead of wasting time on state of the art programming techniques you can spend all your time on getting the actual application code tested and debugged within the schedule.
If you maximize for latencies instead of optimizing, the interrupt rate of the device will rise significantly. This will cause unnecessary overhead in the system. In extreme cases this may make your application run out of CPU resources.
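The trade-off between latency and interrupt rate can be put into numbers with a small sketch. It assumes that the device raises one interrupt per fragment, which is a simplification made for this example; the function names are hypothetical.

```c
/* Duration of one fragment in milliseconds. */
double fragment_latency_ms(int sample_rate_hz, int fragment_frames)
{
    if (sample_rate_hz <= 0 || fragment_frames <= 0)
        return 0.0;
    return 1000.0 * (double)fragment_frames / (double)sample_rate_hz;
}

/* Interrupts per second, ASSUMING one interrupt per fragment. */
double interrupts_per_second(int sample_rate_hz, int fragment_frames)
{
    if (sample_rate_hz <= 0 || fragment_frames <= 0)
        return 0.0;
    return (double)sample_rate_hz / (double)fragment_frames;
}
```

Under that assumption, at 48000 Hz a 960 frame fragment lasts 20 ms and costs 50 interrupts per second, while a 48 frame fragment lasts 1 ms and costs 1000 interrupts per second.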
There are devices that can't meet certain latency requirements. However the right policy is not to reject such devices. Rejecting them might be acceptable in applications that don't work at all if the latency requirements are not met. However consumer oriented applications such as games should work with whatever device the customer happens to have (in most cases the user will not be willing to buy another device to make your application happy).
When really low latencies (a few samples) are needed you will need to use special hardware and special real-time operating systems. Such latencies will not be possible with typical PC sound cards (not even professional ones) or general purpose operating systems.
This is not entirely correct. The worst case latencies caused by any device driver are microseconds, while the latencies needed for audio are in milliseconds. That means the driver latencies are only about 1/1000th of what is required.
The latencies caused by the devices are usually slightly longer than the driver latencies. Most (PCI) devices have a FIFO of up to 64 samples. Devices based on USB have latencies of one or more milliseconds. This is something that even advanced drivers cannot fix.
The most significant latencies are caused by the other applications and devices being used at the same time as your application. In the worst case these latencies may be as high as 500 ms. This depends on the operating system and its version. Some configuration parameters or installed patches may also improve the situation. However no sound driver can improve it.
There are some cases where this may be important. However the large majority of OSS applications that set the fragment size do so only because the programmer has seen some other application doing it. In reality selecting the fragment size manually is necessary in only a very few applications. Please see the Audio timing considerations section for more info.
OSS is designed to select the right fragment size for most purposes. In most cases the fragment sizes selected manually in the application are much worse. They are likely to cause increased CPU usage and some other problems that don't happen with fragment sizes selected by OSS.
So instead of even thinking about the fragment size the programmer should let OSS select it freely. It's only necessary to think about fragment sizes if problems are detected while testing the application.
When OSS was originally developed in 1992 the world was very different from now. The usual PC systems were based on 50 MHz or slower 486 processors or 60/66/90 MHz Pentiums. It was not possible to even dream about things like MP3 playback. Hard disks were slow and even recording .wav files directly to disk was causing problems.
However, today's computer systems are about 1000 times (or more) faster than in the early 90's. It's very hard to imagine applications where extreme optimizations are needed. They may be necessary in applications such as multi-track (tens of tracks or more) hard disk recorders, high quality music synthesizers or effect processors. However ordinary applications will use something like 1% of the available CPU resources for audio processing, so optimizing it may not be worth the money. However it can be done just for fun.
Remember that in time-synchronous real-time applications like audio, video playback/recording, computer games and many others it doesn't matter if the application uses 0.1%, 1% or 10% of extra CPU time as long as it doesn't push the total system load above something like 80%. Until then the extra CPU cycles needed by the application just mean that the CPU doesn't need to wait in some idle loop during that time. The application itself will not run any faster even if you spend ten years and rewrite everything in carefully hand optimized assembly code. Nobody will notice anything.
The above doesn't mean that you must not care about code performance at all. This is not the case. You should just understand that optimizations are not the first thing you should think about. It's usually more important to use less complicated algorithms if that means you can make your code more reliable or add some useful features that are not possible with the optimal algorithms.
Note that the above is true only for time-synchronous applications. There are many other kinds of applications where CPU usage is the bottleneck.
The amount of CPU time spent in the write and read system calls is a couple of microseconds or less. This is all that can be optimized by using mmap. However the price is that mmap bypasses many OSS features that are performed by the read and write calls. For example any kind of format conversion (sample rate, sample size, endianness, etc.) made by OSS will not be available when mmap is used. In some cases this also means that software based VU/peak meters in the control panel don't show any signal. So you should consider this before saving a few microseconds by using mmap.
It's true that the CPU load caused by read/write will rise if the fragment size is made smaller, which in turn means that read/write must be called more often. For example if the fragment size is 1 ms there are 1000 reads/writes per second. This means that the write overhead will be in the order of milliseconds per second. However even this gives less than 1% of extra system load. Also there will be 1000 context switches/second that cannot be avoided even if you use mmap.
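The arithmetic behind that estimate can be written out as a small sketch. The function name is hypothetical, and the per-call cost of 2 microseconds used in the note below is only an assumed stand-in for "a couple of microseconds".

```c
/* CPU load, in percent of one CPU, caused by making `calls_per_second`
 * system calls that each cost `cost_us` microseconds. */
double syscall_load_percent(double calls_per_second, double cost_us)
{
    double busy_seconds_per_second = calls_per_second * cost_us / 1e6;
    return busy_seconds_per_second * 100.0;
}
```

With 1000 calls per second at an assumed 2 microseconds each, the calls take about 2 ms out of every second, or roughly 0.2% of one CPU.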
Using real-time or linear priorities (or whatever they are called) is necessary in some applications that need very low latencies. However the cases where such features are necessary are rather rare. You should first think about whether you really need latencies that low. There are some risks in using high priorities. For example if such an application starts malfunctioning it may be very difficult to stop it without resetting or powering off the computer.
OSS has the virtual mixer driver that does "mixing" in kernel space instead of using user space mixing as some competitors do. This is "known" to be evil. For this reason some programmers are scared of it.
Fortunately this is just an urban legend. Mixing done at driver level is in no way slower than mixing in user space. In fact user space mixing requires passing the data between the client applications and the mixer task. This causes a couple of unnecessary context switches, among other things. In addition the mixing operation itself is so quick that it doesn't cause any noticeable CPU load anyway.
Some novice programmers seem to believe that they should add support for as many hardware features as possible to their application. Unfortunately this is dangerous because all devices are different. For this reason OSS is designed so that programmers don't need to care about the hardware features at all. The application just tells OSS what it wants and OSS will take care of the rest. For example this means that the application should usually not check what kind of sample formats or sample rates the device supports. Instead it just tells OSS what kind of audio stream it has. OSS then takes care of converting the stream to a format that is supported by the device. In this way the application is guaranteed to work with any device, including possible future devices that use some format that has not been invented yet.
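Telling OSS what kind of stream the application has might look like the following sketch. The function name describe_stream is hypothetical. The ioctl requests SNDCTL_DSP_CHANNELS, SNDCTL_DSP_SETFMT and SNDCTL_DSP_SPEED and the format constant AFMT_S16_LE come from the OSS header, which is assumed here to be installed as sys/soundcard.h.

```c
#include <sys/ioctl.h>
#include <sys/soundcard.h>

/* Describes the application's audio stream to OSS instead of probing
 * what the hardware supports. `fd` is an already opened audio device.
 * Returns 0 on success and -1 if any of the calls fails. */
int describe_stream(int fd, int channels, int format, int rate)
{
    /* The arguments are passed by pointer; OSS may write back the
     * value it actually selected. */
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, &channels) == -1)
        return -1;
    if (ioctl(fd, SNDCTL_DSP_SETFMT, &format) == -1)
        return -1;
    if (ioctl(fd, SNDCTL_DSP_SPEED, &rate) == -1)
        return -1;
    return 0;
}
```

For a 16 bit little endian stereo stream at 44100 Hz the call would be describe_stream(fd, 2, AFMT_S16_LE, 44100), made once after opening the device and before the first write.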
The OSS package contains mixer and control panel programs for setting all the device features. It is possible for applications to change them themselves. However this is not recommended because all devices are different. Having all applications in the world support all possible devices (past, current and future) would be a massive waste of time.
It may sound like a nice idea to include a "full" mixer in every audio player. However this is certainly not a good idea. Applications such as audio/media players are primarily audio players, not mixers. Mixers are all different and some devices don't have mixers at all. It doesn't make much sense to have a volume slider in the application if it's not functional with a large number of devices. In addition there is a risk that the volume slider changes the wrong volume. The same is true of using the mixer API for controlling recording volumes or selecting the recording source.
We have included some new ioctl calls in the OSS 4.0 audio API. So instead of using the mixer API it's possible for audio applications to do these common changes very easily. However even then it's important to understand that some (professional) devices don't have any volume control or recording sources to select. Please see the Audio input and output volumes and routings section for more info.