[Xiph-Advocacy] example of lossless fade out

Gregory Maxwell gmaxwell at gmail.com
Tue Jul 7 14:41:08 PDT 2009


On Tue, Jul 7, 2009 at 5:18 PM, Tor-Einar Jarnbjo<tor-einar at jarnbjo.name> wrote:
> A lossless "perfect" fade-out is probably possible with MP3, since it
> works with a fixed frame length. I'm not very familiar with the MP3
> internals, but I suppose it is possible to attenuate each frame with a
> fixed amount and since the frames are faded into each other with complete
> overlaps (the second half of the first frame is blended into the first
[snip]
> amplitude for a very brief period. For normal encoding parameters, we
> are just talking about a few ms here, so I _assume_ at least that this
> "fade artifact" is not audible, but I am really not sure.

So, not quite.  The reason this is commonly done for MP3 is that the
encoding makes it dead simple. It's pretty much just a bit offset into
the frame that you need to twiddle.  For replaygain it's not completely
lossless as advertised, because the global gain field (8 bits, in steps
of about 1.5 dB) has a finite range and you can bump into the ends of
it.
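
To make the "bump into it" part concrete, here is a rough sketch in
Python. It assumes only the 8-bit global_gain field, where one step
scales the amplitude by 2^(1/4). The function names are made up for
illustration, and no real bitstream parsing is shown.

```python
import math

# One global_gain step multiplies amplitude by 2**(1/4), roughly 1.5 dB.
STEP_DB = 20 * math.log10(2 ** 0.25)

def adjust_global_gain(global_gain, steps):
    """Hypothetical helper: shift an 8-bit gain value, saturating at 0..255.

    Returns (new_value, clipped).
    """
    wanted = global_gain + steps
    new_value = max(0, min(255, wanted))
    return new_value, new_value != wanted

def undo_is_exact(global_gain, steps):
    """True if applying `steps` and then `-steps` restores the original."""
    new_value, _ = adjust_global_gain(global_gain, steps)
    restored, _ = adjust_global_gain(new_value, -steps)
    return restored == global_gain
```

A value in the middle of the range survives the round trip, while one
near either end saturates and cannot be restored, which is where the
"lossless" claim stops holding.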

In Vorbis you could, as you described, change the floor.  But this
isn't just a matter of changing a single value; it is far more
complicated and requires disassembling and reassembling most of the
frame.

The overlap isn't an issue, except that Vorbis has much longer frames,
so the amplitude change is going to be more coarsely stepped in time.
In both the MP3 and Vorbis cases, simply twiddling the global block
gain from frame to frame is not the same as a true time domain fade
out, because the frames are *not* merely overlapped: they are
overlapped and critically sampled.  The overlap isn't an overlap in the
sense that you're thinking, where a true linear gain becomes possible.
The codecs are taking 2N samples and outputting N MDCT coefficients
(effectively a DCT-IV of the folded block), so the overlap is actually
required for perfect reconstruction.
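
A toy illustration of the "2N samples in, N values out" point, in plain
Python. This is a textbook MDCT with a sine window, not the actual MP3
or Vorbis filterbank; the block size and test signal are arbitrary
choices for the sketch.

```python
import math

def sine_window(N):
    # Symmetric window with w[n]**2 + w[n+N]**2 == 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame, N):
    # 2N windowed samples in, only N coefficients out (critical sampling).
    return [sum(frame[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coefs, N):
    # N coefficients in, 2N samples out; each half still contains aliasing.
    return [(2.0 / N) * sum(coefs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]

def roundtrip(x, N):
    # Analyse frames that overlap by N samples, then window and overlap-add.
    w = sine_window(N)
    out = [0.0] * len(x)
    for start in range(0, len(x) - 2 * N + 1, N):
        frame = [w[n] * x[start + n] for n in range(2 * N)]
        y = imdct(mdct(frame, N), N)
        for n in range(2 * N):
            out[start + n] += w[n] * y[n]
    return out

N = 8
x = [math.sin(0.3 * n) for n in range(8 * N)]
y = roundtrip(x, N)

# Samples covered by two overlapping frames come back intact.
inner_error = max(abs(y[n] - x[n]) for n in range(N, len(x) - N))
# The first N samples are covered by one frame only, so the alias remains.
edge_error = max(abs(y[n] - x[n]) for n in range(N))
```

Here inner_error should be at rounding-noise level while edge_error is
large: a single frame on its own does not give you the signal back,
only the sum of two neighbouring frames does.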

The changing gain from frame to frame will produce some weak
side-tones, because the time domain alias cancellation only works
exactly when adjacent frames are scaled the same, but it will still
sound okay.
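
The same kind of toy MDCT (sine window, plain Python, not the real
codec filterbanks) can show the leftover alias when the gain changes
from frame to frame. In each overlap region the output is the input
times a smooth crossfade between the two frame gains, plus a
time-reversed alias term that scales with the gain difference. The
gain values and signal below are arbitrary assumptions for the sketch.

```python
import math

def sine_window(N):
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame, N):
    return [sum(frame[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coefs, N):
    return [(2.0 / N) * sum(coefs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]

def decode_with_frame_gains(x, N, gains):
    # Scale every coefficient of frame i by gains[i], then overlap-add.
    w = sine_window(N)
    out = [0.0] * len(x)
    for i, g in enumerate(gains):
        start = i * N
        frame = [w[n] * x[start + n] for n in range(2 * N)]
        coefs = [g * c for c in mdct(frame, N)]  # the per-frame gain twiddle
        y = imdct(coefs, N)
        for n in range(2 * N):
            out[start + n] += w[n] * y[n]
    return out

def alias_residual(x, N, gains):
    # Largest deviation from "input times smooth gain envelope" over all
    # regions where two frames overlap.
    w = sine_window(N)
    out = decode_with_frame_gains(x, N, gains)
    worst = 0.0
    for i in range(len(gains) - 1):
        for n in range(N):
            t = (i + 1) * N + n
            # Tail of frame i fades out while the head of frame i+1 fades in.
            envelope = gains[i] * w[N + n] ** 2 + gains[i + 1] * w[n] ** 2
            worst = max(worst, abs(out[t] - envelope * x[t]))
    return worst

N = 16
x = [math.sin(0.3 * n) for n in range(8 * N)]
flat = [0.5] * 7                         # same gain in every frame
fade = [1.0 - 0.1 * i for i in range(7)]  # gain stepping down per frame

flat_residual = alias_residual(x, N, flat)
fade_residual = alias_residual(x, N, fade)
```

With a constant gain the residual is rounding noise; with the stepped
fade it is small but clearly nonzero, which is the uncancelled alias.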

In practice, simply decoding and re-encoding the frames of the fadeout
would also sound okay. It would take more CPU, but it would probably
take less programming effort.
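
For the re-encode route, the only new piece is the fade itself, applied
to decoded PCM before it goes back through the encoder. A minimal
sketch, assuming the samples are already decoded into a list of floats;
the function name is hypothetical and the decode and encode steps are
left out.

```python
def linear_fade_out(samples, fade_len):
    """Return a copy of `samples` with a linear fade over the last
    `fade_len` samples, ending at zero gain on the final sample."""
    fade_len = min(fade_len, len(samples))
    out = list(samples)
    first = len(samples) - fade_len
    for j in range(fade_len):
        gain = 1.0 - (j + 1) / fade_len  # steps down to 0.0 at the end
        out[first + j] = samples[first + j] * gain
    return out

faded = linear_fade_out([1.0] * 6, 4)
```

Only the frames that overlap the faded region need to be re-encoded;
everything before them can be copied through untouched.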

