[opus] Gapless concatenation of Opus frames

Thu Nov 16 01:25:48 UTC 2017

Hi Andreas,

So I encoded your file in chunks with a slightly modified version of
opusenc_example and I can't hear anything wrong. Maybe there's a problem
in the tools you used? I uploaded the files at:

https://jmvalin.ca/misc_stuff/continuous.opus (one file)
https://jmvalin.ca/misc_stuff/continuous.wav (one file, decoded)
https://jmvalin.ca/misc_stuff/chained.opus (many small files)
https://jmvalin.ca/misc_stuff/chained.wav (many small files, decoded)

Can you hear any of the glitches you mentioned in continuous.wav? If
there's indeed an issue, it can likely be fixed by just adding a small
amount of redundancy to libopusenc. There's no fundamental reason it
can't be perfectly glitchless.

Cheers,

	Jean-Marc

On 11/15/2017 03:00 AM, Andreas Stöckel wrote:
> Hi Jean-Marc (and everyone else who replied),
> 
>> Considering you're switching to Ogg, I think you should give libopusenc
>> a try. It does a really good job at getting rid of *all* discontinuities
>> -- to the point where you can chop a song into files less than one
>> millisecond each and it still sounds good. It's also pretty simple to
>> use. You just feed it audio and tell it where the file boundaries are.
> 
> thank you for pointing me at libopusenc. I had a look at the source
> code and liked the idea of using Linear Predictive Coding for the
> generation of the lead-in/lead-out frame. This avoids some
> high-frequency content that my mirroring technique produced. I
> C++ified the corresponding 1994 LPC code and implanted it into my
> program [1]. Works like a charm.
> 
> Since my program seemed to work fairly well I was doing some extended
> tests and found one particular case where it still produces audible
> artifacts.
> 
> Unfortunately, libopusenc with ope_encoder_continue_new_file (see [2]
> for my code) produces similar (though not the same) audible artifacts.
> The affected audio file has very low frequency content (produced by a
> Taiko).
> 
> In my program the low frequency content seems to be phase shifted,
> producing a discontinuity at the transition between Ogg files [3].
> 
> Libopusenc seems to introduce ringing artifacts [4], resulting in a
> similar, yet less pronounced clicking noise. (Maybe the ringing
> stems from no "lead-in" frame being used -- in my program I do
> a reverse LPC at the beginning of the first audio chunk to create an
> artificial frame that leads up to the first frame [7].)
> 
> You can reproduce the libopusenc problem by compiling my adapted
> opusenc_example.c [2] and feeding in a segment of the affected RAW
> audio as indicated at the beginning of my source code. The RAW can be
> downloaded here [5] (48000 Hz, stereo, 16-bit signed, little endian;
> the complete song can be downloaded here [6]).
> 
> Any idea how either of the two issues (in libopusenc or in my
> program) might be solved?
> 
> 
> Again, thank you for your help!
> 
> Cheers,
> Andreas
> 
> 
> [1] https://github.com/astoeckel/opus_gapless/blob/master/lpc.cpp
> 
> [2] https://gist.github.com/astoeckel/6731bc846a2f70dd7f5e155e75683fae
> 
> [3] https://somweyr.de/opus/click_opus_gapless.png
> 
> [4] https://somweyr.de/opus/click_libopusenc.png
> 
> [5]
> https://somweyr.de/opus/test_libopusenc_ope_encoder_continue_new_file.raw.bz2
> 
> [6] https://www.youtube.com/watch?v=z64HCi2rQkE
> 
> [7]
> https://github.com/astoeckel/opus_gapless/blob/master/opus_gapless.cpp#L82
> 
>>
>> Cheers,
>>
>> 	Jean-Marc
>>
>> On 11/13/2017 04:16 PM, Andreas Stöckel wrote:
>>> Hi Jean-Marc,
>>>
>>> thank you for your answer!
>>>
>>> Yes, you understood my question correctly. I was just about to compose
>>> a reply to my original question, where I described how I solved my
>>> problem. As you've already suggested, I've switched to Ogg/Opus, which
>>> is better supported, but does not work with the Media Source Extensions.
>>>
>>> I'll have a look whether disabling prediction will help with the
>>> transitioning phase, but I think the way I'm implementing it right now
>>> it probably won't.
>>>
>>> So here is what I was going to write originally:
>>>
>>> When I wrote the question, I wasn't really aware of the pre-skip
>>> (CodecDelay in WebM) and DiscardPadding [1]. However, these properties
>>> can only be set on a per-stream basis, and not on independent
>>> sequences of WebM packets. Because I was unaware of pre-skip, I also
>>> didn't append an additional frame to the audio, so the 6.5 ms lost
>>> to the pre-skip couldn't be recovered when decoding. As an additional
>>> complication, there is no way to indicate in a WebM stream that the
>>> decoder should reset. So if anything, we can only concatenate entire
>>> files/streams, not individual packets.
>>>
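
A minimal sketch of the pre-skip arithmetic described above, not taken
from any of the linked code. It assumes 20 ms frames at 48 kHz (960
samples) and a pre-skip of 6.5 ms; the helper names us_to_samples and
frames_needed are hypothetical. It illustrates why a 1 s block needs one
frame more than the 50 that hold the audio itself once the decoder drops
the pre-skip samples.

```c
#include <assert.h>
#include <stdint.h>

/* Ogg Opus expresses pre-skip in samples at 48 kHz. */
#define OPUS_RATE_HZ 48000

/* Convert a duration in microseconds to samples at 48 kHz
   (hypothetical helper, rounds down). */
int64_t us_to_samples(int64_t us) {
    assert(us >= 0);
    return us * OPUS_RATE_HZ / 1000000;
}

/* Number of fixed-size frames that must be encoded so that `payload`
   samples are still available after the decoder discards `pre_skip`
   samples from the start (hypothetical helper, rounds up). */
int64_t frames_needed(int64_t payload, int64_t pre_skip, int64_t frame_size) {
    assert(payload >= 0 && pre_skip >= 0 && frame_size > 0);
    return (payload + pre_skip + frame_size - 1) / frame_size;
}
```

Under these assumptions, us_to_samples(6500) gives 312 samples, and
frames_needed(48000, 312, 960) gives 51 instead of 50.
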
>>> However, playing back individual WebM streams with CodecDelay and
>>> DiscardPadding set (and an additional lead-out frame) did not work,
>>> since CodecDelay/DiscardPadding were not interpreted correctly
>>> by Chromium/Firefox and even ffmpeg. There is a method for gapless
>>> concatenation of entire files using MSE, described here [2], but this
>>> didn't work for Firefox and still produced audible artifacts on Chrome.
>>>
>>>
>>> Well, the way I'm solving the problem now is the following:
>>>
>>> First, I've switched to Ogg/Opus. Second, I'm appending a reversed
>>> version of the first/last 20ms to the beginning/end of the audio chunk
>>> I'm encoding. This reduces ringing artifacts from the transient at the
>>> beginning/end of the chunk. I then set pre-skip and the granule of the
>>> last packet in the generated Ogg stream in such a way that the
>>> relevant audio information is "cut out". In contrast to WebM, browsers
>>> (and ffmpeg) actually correctly interpret this meta-information in an
>>> Ogg container. However, browsers do not support Ogg in conjunction
>>> with the Media Source Extensions. Thus, I've ditched MSE and I am now
>>> decoding the individual chunks with the WebAudio API and schedule
>>> gapless playback of the chunks (which is not optimal, since WebAudio
>>> is rather finicky).
>>>
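
A rough sketch of the mirroring and trimming steps described above. It
is an illustration under stated assumptions, not the code from [3]: the
names mirror_pad, trim_for_chunk and chunk_trim are hypothetical, the
PCM is assumed to be interleaved 16-bit at 48 kHz, the granule position
is assumed to start at 0, and encoder_delay stands for whatever
lookahead the encoder reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header values that let a decoder cut the padding out again
   (hypothetical struct, field names are illustrative). */
typedef struct {
    int64_t pre_skip;      /* samples dropped at the start of the stream */
    int64_t final_granule; /* granule position written on the last page */
} chunk_trim;

/* Write `pad` reversed samples, then the chunk, then `pad` reversed
   samples into `out`, which must hold (frames + 2 * pad) * channels
   values. `in` is interleaved PCM with `frames` samples per channel. */
void mirror_pad(const int16_t *in, size_t frames, int channels,
                size_t pad, int16_t *out) {
    assert(in && out && channels > 0 && pad <= frames);
    size_t ch = (size_t)channels;
    for (size_t i = 0; i < pad; i++) {
        for (size_t c = 0; c < ch; c++) {
            /* Lead-in: the first `pad` samples in reverse order. */
            out[i * ch + c] = in[(pad - 1 - i) * ch + c];
            /* Lead-out: the last `pad` samples in reverse order. */
            out[(pad + frames + i) * ch + c] = in[(frames - 1 - i) * ch + c];
        }
    }
    /* The unmodified chunk sits between lead-in and lead-out. */
    for (size_t i = 0; i < frames * ch; i++)
        out[pad * ch + i] = in[i];
}

/* Pre-skip covers the encoder delay plus the lead-in; the final granule
   position ends playback right after the real audio, so the lead-out
   is discarded as well. */
chunk_trim trim_for_chunk(int64_t frames, int64_t pad, int64_t encoder_delay) {
    assert(frames >= 0 && pad >= 0 && encoder_delay >= 0);
    chunk_trim t;
    t.pre_skip = encoder_delay + pad;
    t.final_granule = t.pre_skip + frames;
    return t;
}
```

For a 1 s chunk with 20 ms of padding on each side and an assumed
encoder delay of 312 samples, trim_for_chunk(48000, 960, 312) yields a
pre-skip of 1272 and a final granule position of 49272.
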
>>> The working implementation can be found here [3]. Since Ogg is so much
>>> simpler than WebM I also wrote my own minimal C++ Ogg/Opus muxer,
>>> which shaves off another dependency of my application.
>>>
>>>
>>> Thank you for your help,
>>> Andreas
>>>
>>>
>>>
>>> [1] https://wiki.xiph.org/MatroskaOpus
>>>
>>> [2]
>>> https://developers.google.com/web/fundamentals/media/mse/seamless-playback
>>>
>>> [3] https://github.com/astoeckel/opus_gapless
>>>
>>> On 2017-11-13 03:42 PM, Jean-Marc Valin wrote:
>>>> Hi Andreas,
>>>>
>>>> So if I understand your question correctly, what you want is really
>>>> short "files" that are independent, yet create a glitchless stream
>>>> when concatenated, right? For Ogg, this can be implemented with
>>>> libopusenc and chaining. It works pretty well (even for really tiny
>>>> files). For WebM, I'm not sure how to handle the details at the
>>>> container level, but for how to handle the transition details (reset and
>>>> all), I suggest you have a look at the libopusenc code. In general, the
>>>> idea is to disable the prediction at the point of the transition between
>>>> two files and to include the transition frames in both files.
>>>>
>>>> Cheers,
>>>>
>>>> 	Jean-Marc
>>>>
>>>> On 11/08/2017 03:43 AM, Andreas Stöckel wrote:
>>>>> Hi!
>>>>>
>>>>> Short version of my question: How to produce Opus frames which can be
>>>>> safely concatenated and how to embed them into a WebM file?
>>>>>
>>>>> Long version:
>>>>>
>>>>> I'm currently implementing a web-based audio player which streams
>>>>> audio as opus/WebM using the HTML5 media source extensions. Currently,
>>>>> the server decodes a set of input files to a fixed RAW audio format
>>>>> (stereo, 48000 Hz) and encodes the resulting continuous RAW stream as
>>>>> Opus/WebM. Having a single, uninterrupted RAW stream allows for
>>>>> perfect gapless playback on the client (which only sees a single live
>>>>> WebM stream), i.e. there are no interruptions whatsoever when
>>>>> transitioning between continuous tracks from the same music album.
>>>>>
>>>>> An early tech-demo of the technique can be found here [1], the source
>>>>> file http_audio_server/encoder.cpp implements the relevant
>>>>> opus-encoding and webm-encapsulation (but see also [2] for a condensed
>>>>> version).
>>>>>
>>>>>
>>>>> Now, for performance reasons I'd like to split my RAW audio into
>>>>> independent blocks (say, as an example, 50 frames or 1s each), encode
>>>>> these as raw Opus frames and cache them on disc ahead of time. For
>>>>> each block I'd like to reset the encoder to ensure independence
>>>>> between the first frame of each block and the last frames in the
>>>>> previous block, e.g., using
>>>>>
>>>>> opus_encoder_ctl(enc_ctx, OPUS_RESET_STATE)
>>>>>
>>>>> When the client requests a certain sequence of blocks (which may
>>>>> originate from various input files in (let's pretend) any order) my
>>>>> goal is to (on-demand) encapsulate the pre-encoded frames as WebM and
>>>>> send them to the client.
>>>>>
>>>>> However, in early experiments [2], resetting the encoder state at the
>>>>> beginning of each block and then concatenating the frames in the WebM
>>>>> container leads to clearly audible gaps in the decoded WebM stream
>>>>> whenever the opus encoder has been reset.
>>>>>
>>>>> Interestingly, such artifacts are far less pronounced (if they exist
>>>>> at all), if I don't explicitly reset the encoder. However, in my real
>>>>> application the encoder will at least be reset implicitly (e.g. by
>>>>> starting the encoding process in multiple threads for two files which
>>>>> may be played consecutively).
>>>>>
>>>>> See [2] for a MWE which expresses what I've tried to describe above.
>>>>>
>>>>> So to rephrase my question: if it is possible at all, how can I
>>>>> independently pre-encode blocks of Opus audio frames, such that I can
>>>>> concatenate them during WebM muxing without audible glitches?
>>>>>
>>>>>
>>>>> In advance, thank you for your help. Please let me know if anything I
>>>>> wrote is unclear, or if you need more information to answer my question.
>>>>>
>>>>>
>>>>> Andreas
>>>>>
>>>>>
>>>>> [1] https://github.com/astoeckel/http_audio_server/
>>>>> [2] https://github.com/astoeckel/opus_gapless_webm/