[Flac-dev] Synchronizing a streaming client to the server Was: Idea to possibly improve flac?
brianw at sounds.wa.com
Fri Jan 7 16:36:30 PST 2011
This thread has raised several good topics. It's surprising that the
FLAC-Dev list has been silent for years, and now suddenly there are
several good ideas to discuss.
On Jan 7, 2011, at 15:04, David Richards wrote:
> I am interested in streaming lossless audio, and FLAC is probably the
> best option for that. Currently the OggFLAC way of doing it mostly
> works with a few hacks in libflac and my version of edcast. It might
> be that the Ogg packaging layer is ill suited for this purpose, and an
> alternative model should be developed. I've seen that it's possible to
> stream native flac with netcat, but that's not really the solution I'm
> looking for.
I have not done much work with streaming. I have written a lot of
serious code that uses the FLAC library. I remember that there used
to be separate objects in the FLAC library for streams, and they were
distinct from the file objects because you can seek backwards in a
file, but you cannot seek backwards in a stream. For some reason, it
seems that these objects have been removed in the latest versions of
the FLAC library.
Can anyone explain the issues with streaming pure FLAC? What does
OggFLAC add to make streaming possible, or even easier than pure
FLAC? I thought that OggFLAC was just a way to put FLAC blocks into
the Ogg file format. Apple's CAF specification would also allow FLAC
blocks to be placed inside their file container, although this still
would not force iTunes to play FLAC unless a decoder were installed
in the system.
What is it about netcat that you don't like? Can you describe what
you're looking for, and why the specific details are important? I
was always under the impression that the FLAC format was already
designed for streaming, but I must admit that I've never studied that
aspect of the format in depth.
> On Fri, Jan 7, 2011 at 5:58 PM, Tor-Einar Jarnbjo <tor-
> einar at jarnbjo.name> wrote:
>> On 07.01.2011 at 23:38, David Richards wrote:
>>> I'm also interested in another concept of lossless streaming with
>>> flac. Lets call it broadcast flac. A problem with streaming for long
>>> periods of time is that the sending and receiving computers
>>> clocks go
>>> out of sync, for example even if I stream myself on localhost, with
>> This is not a FLAC specific problem, but has to be handled in all
>> cases where the streaming server is in control of the transmitted
>> data rate. It's caused by a playback device whose actual sample rate
>> differs slightly from the sample rate that was requested, or by a
>> streaming source whose system clock runs slightly fast or slow.
>> Since these parameters (at least an exact sample rate) are hard to
>> achieve, this is a rather common problem. Or to shorten it: if the
>> data has a sample rate of 44100 and your sound card consumes more
>> than 44100 samples per "sender-time" second, your buffer will
>> eventually be exhausted. If it's the other way around, your buffer
>> may overflow if the client does not handle these cases properly.
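Tor-Einar's point can be made concrete with a little arithmetic. The
helper below is a hypothetical sketch (not part of libFLAC) that
computes how many samples of drift accumulate for a given clock error
in parts per million:

```c
#include <stdint.h>

/* Hypothetical helper (not part of libFLAC): how many samples of drift
 * accumulate when the consumer's clock runs `ppm_error` parts per
 * million fast (or slow, if negative) relative to the sender's clock. */
static int64_t drift_samples(int64_t sample_rate, double ppm_error,
                             double seconds)
{
    return (int64_t)((double)sample_rate * seconds * ppm_error / 1e6);
}
```

At 44100 Hz, a 100 ppm clock mismatch works out to 15876 samples per
hour, roughly a third of a second, so on long timescales the buffer
problem really is unavoidable.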
> I am well aware it's not flac specific, but a standard way of
> handling such a matter could be part of the packaging for streaming.
I think that this would be a good opportunity to design a solution
that is specific to broadcast. At the sending end, the server should
have knowledge of when there are breaks in the content. If the
stream could send flags at these breaks, then the receiving client
could go silent and reset the synchronization. As you describe, the
situation only becomes a problem after long periods of time, but I
would guess that there are enough station breaks (or at least song
breaks) in a long broadcast that there would be a chance for a reset.
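As a sketch of what the client side of such a break flag might look
like (all names here are hypothetical; no such flag exists in the FLAC
stream today):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client state for the break-flag idea.  On a server-sent
 * "break" flag the client discards its (silent) backlog and restarts
 * its sample counter, forgetting any accumulated clock drift. */
typedef struct {
    size_t  buffered;   /* samples queued for playback */
    int64_t local_pos;  /* samples played since the last resync */
} PlayerState;

static void handle_break_flag(PlayerState *p)
{
    p->buffered  = 0;   /* drop the silent backlog instead of playing it */
    p->local_pos = 0;   /* both ends now agree the timeline starts over */
}
```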
CoreAudio is a pull model, and the API provides a time line that can
be used to find the audio samples for a specific time. However,
there are many cases where this time line gets reset. Usually, each
callback has a time stamp that occurs precisely after the previous
callback. Obviously, the audio should not glitch when the time line
is contiguous, and thus the data must be sample-accurate. However,
CoreAudio code must also deal with situations where the time line
starts over from 0, usually under control of the host application.
CoreAudio also has a flag in the callback to indicate when the
buffers are totally silent. I'd like to borrow these ideas, or at
least similarly-inspired ideas, and have FLAC streaming designed such
that the stream can tell the playback software when to reset.
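The contiguity test such a pull-model callback performs could be
sketched like this; the structure and names are invented stand-ins,
not CoreAudio's actual timestamp types:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a pull-model render callback's bookkeeping;
 * these names are invented, not CoreAudio's. */
typedef struct {
    int64_t expected;   /* sample time expected next; -1 before first call */
} TimelineState;

/* Returns true when this callback's timestamp follows the previous one
 * exactly; false signals that the host has reset the timeline. */
static bool timeline_is_contiguous(TimelineState *t, int64_t sample_time,
                                   int64_t frames)
{
    bool contiguous = (t->expected < 0) || (sample_time == t->expected);
    t->expected = sample_time + frames;
    return contiguous;
}
```

When the check fails, the renderer treats it as a restart rather than
an audio glitch, which is exactly the behavior a streaming reset flag
would want to trigger.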
The typical process to deal with synchronization of separate systems
is sample rate conversion. However, this introduces distortion into
the audio, especially with real-time SRC. The only way to avoid SRC
is to have some way to reset the alignment without dropping or adding
samples. As I said above, if the broadcast server were to put flags
in the stream to indicate silent breaks in the audio, then the
playback client could drop silent samples or insert silent samples
until the two time lines are resynchronized. But, since this would
only add or remove silence, there should be absolutely no audible
glitch. Perhaps the stream would need more than simple silent flags,
or resync flags. It might be necessary to transmit an actual running
time line counter, with enough bits to count the longest stretch of
contiguously-clocked audio blocks. When the broadcast server sees a
break in the content material, the time code could be reset to zero,
and this would tell the client to start the sync over, thus avoiding
dropped samples in the middle of real audio content.
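A minimal sketch of the client arithmetic under that scheme, assuming
a hypothetical running timecode field carried in the stream:

```c
#include <stdint.h>

/* Sketch assuming a hypothetical running timecode in the stream that
 * the server resets to zero at each content break.  Positive result:
 * the stream delivered more samples than the client played, so drop
 * that much (silent) backlog at the break; negative: insert that many
 * silent samples instead. */
static int64_t silent_adjustment(int64_t stream_timecode,
                                 int64_t samples_played)
{
    return stream_timecode - samples_played;
}
```

Because the correction is applied only at a flagged break, the samples
being dropped or inserted are silence, and the adjustment is inaudible.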
>>> Anyway what could happen is the client could do a little bit of
>>> re-sampling here or there to ensure it's in sync with the server's
>>> clock.
>> That is how streaming clients usually solve this problem, although
>> it is not really improving sound quality.
> It's probably not a big deal if you don't resample all the time, just
> when you're off by X amount; all of this would just be client side
> preferences. As long as the client side "knows" it's off by X amount
> you could handle it in any number of ways. I'd be fine if it just
> crossfaded to the correct timing when it was off by more than half a
> second; then no resampling would ever happen, you would just get a
> weird effect about once an hour, better than a buffer underrun or lag.
> Or perhaps the client could look for a half second of silence and just
> cut it out.
I don't think it's a good idea to resample just some of the time,
although your idea to crossfade would work since it never resamples.
I think that there are a number of PC-based digital audio playback
systems, and perhaps even in the television broadcast industry, where
this idea of intermittent resampling is done. I hear a regular
glitch in audio about once per second in many syndicated television
shows, and my suspicion is that they are speeding up the show so that
they can sell more commercial time. Another place that I hear this
glitching is in some of the PC audio software oriented for DJs which
can play MP3 files at different speeds and mix them together. I hear
the same sound - one glitch per second - and it is very annoying.
But, as you said, a crossfade once per hour would not be as bad.
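For the crossfade itself, a plain linear fade between the old and the
re-aligned signal would do. The sketch below is a generic illustration,
not tied to any FLAC API; applied once at a resync point, it absorbs
the timing jump without the per-second glitching of continuous
resampling:

```c
#include <stddef.h>

/* Generic linear crossfade from `from` to `to` over n samples. */
static void crossfade_resync(const float *from, const float *to,
                             float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* t ramps from 0 at the first sample to 1 at the last */
        float t = (n > 1) ? (float)i / (float)(n - 1) : 1.0f;
        out[i] = (1.0f - t) * from[i] + t * to[i];
    }
}
```

In practice n would cover a few tens of milliseconds so the transition
is masked.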
Also, the stream could be completely resynchronized even without a
crossfade. Some streaming servers are so bad that they can't run for
hours without rebuffering, but I guess it's probably pretty lazy to
design something that does that on purpose (the rebuffering, that
is). However, as I suggested, it might be better if the broadcast
server gives hints so that the client player can do these crossfades
during the silence between tracks. Using my idea, you'd need to
"crossfade" more than once per hour, because there probably isn't
enough silence to handle it that seldom. But a fraction of a second
between tracks several times per hour would never be noticed, unless
there is a continuous audio broadcast with absolutely no silence.