Short answer: don't use output sample rates other than 44100 or 48000.

Longer answer: Sound chips usually run at one of those rates, often at either.
Those rates are more or less guaranteed to work properly.  Most chips don't
support other rates directly; a software resampler in the driver is used
instead.  Unfortunately, Microsoft released a horribly broken reference
resampler implementation to sound hardware OEMs a few years ago, and many of
them still use it.  On their sound cards, if you ask for 11025 Hz, for
example, you're likely to get 11100 Hz or something similarly imprecise.
That mismatch accumulates over time, so latency and slippage keep growing
for as long as the stream runs.
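
To put a rough number on the slippage, here is a small sketch using the
11025 Hz vs. 11100 Hz figures from the example above.  The function name and
the assumption that the rate error stays constant are mine, purely for
illustration:

```python
def drift_seconds(requested_hz, actual_hz, elapsed_seconds):
    """Audio time gained (+) or lost (-) versus wall-clock time.

    Assumes the device really runs at actual_hz while the application
    treats every sample as if it were taken at requested_hz, and that the
    error is constant (an assumption, not a measured property).
    """
    return elapsed_seconds * (actual_hz - requested_hz) / requested_hz


if __name__ == "__main__":
    per_minute = drift_seconds(11025, 11100, 60)
    print(f"drift after one minute: {per_minute * 1000:.0f} ms")
```

With those figures the error is about 0.68%, which works out to roughly
0.4 seconds of drift per minute of audio.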

Bottom line: voice codec applications that need to work at lower rates
really need to resample to 44.1 kHz or 48 kHz themselves in order to work
robustly across all hardware platforms.  Neither MS nor sound-hardware OEMs
have shown the slightest interest in fixing this bug, so that's just the way
it is.

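As a rough idea of what resampling in the application might look like, here
is a minimal sketch that upsamples a mono block by linear interpolation.
The function name is hypothetical, and linear interpolation is only a
stand-in: a real voice application would more likely want a band-limited
(e.g. polyphase) resampler for quality.

```python
def resample_linear(samples, src_hz, dst_hz):
    """Resample a mono block of float samples from src_hz to dst_hz.

    Illustrative only: linear interpolation, no anti-alias/anti-image
    filtering, and no state carried over between blocks.
    """
    if not samples:
        return []
    # Integer arithmetic keeps the output length exact for this block.
    n_out = len(samples) * dst_hz // src_hz
    step = src_hz / dst_hz  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, last)  # hold the final sample at the edge
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


if __name__ == "__main__":
    # 11025 Hz -> 44100 Hz is an exact 4x ratio.
    print(resample_linear([0.0, 1.0], 11025, 44100))
```

The point is that the application, not the driver, fixes the ratio, so the
device is only ever opened at 44100 or 48000.  For ratios that are not whole
numbers, a streaming version would also need to carry the fractional
position across blocks so that rounding does not reintroduce drift.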
