[speex-dev] API suggestions

Christian Buchner Christian.Buchner1 at epost.de
Thu Jun 5 23:34:58 PDT 2003



> If I understand what you said correctly, there is such a call:
> speex_bits_read_whole_bytes, which adds a couple of bytes to the SpeexBits
> struct while removing the ones that have already been read.

Oh, how could I not have seen that one? That is exactly what I need.
Thanks for the pointer.

>> It is unusual to require a different sequence of API calls
>> depending on whether the signal is Mono or Stereo.
> I'll need to think about that one. Stereo support is really some "extra"
> information that the decoder can decide to use or ignore.

Well, in that case I would tell the API: "force decode to Mono". The
default mode, however, would be "decode to Mono or Stereo, whichever is
present in the bitstream".

This doesn't really have a high priority (the more I think about it).

>> - verify the syntax of the bitstream
>> - verify that the input data has sufficient length for decoding
>>   a frame
> The problem is that this is not possible. The reason is that if you cut
> just before some optional extra information, you can't know if it's
> there or not.

I will probably use a termination code in the bitstream following each
group of frames (inband signalling). This would apply to VBR modes only.

>> And storing frame sizes between frames is a waste of bits in a
>> .WAV container ;)
> I don't think it's a good idea to simply concatenate all the frames
> together. For example, it makes it impossible to recover from corrupted
> streams and seeking becomes complicated. I would suggest doing something
> similar to what Ogg does. 

For CBR this works perfectly. The block size in the WAVE file is defined
by the nBlockAlign value in the WAVEFORMATEX header. These blocks are
of a constant size and facilitate seeking. By grouping a number of frames
together the padding loss becomes minimal (or zero).

Example: CBR Quality 4 Stereo yields 309 bits per frame. I group eight
frames into a block and get 8*309 bits = 2472 bits = 309 bytes for the
nBlockAlign value. No bits are lost to padding.
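
To make the grouping arithmetic concrete, here is a minimal sketch in C.
The function names are hypothetical helpers made up for illustration; they
are not part of the Speex API or of ACM. It assumes every CBR frame has the
same size in bits and that frames are packed back to back within a block.

```c
#include <assert.h>

/* Greatest common divisor (Euclid's algorithm). */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Smallest number of frames whose total size is a whole number of bytes.
 * Hypothetical helper, not a Speex call. */
unsigned frames_per_block(unsigned bits_per_frame)
{
    assert(bits_per_frame > 0);
    return 8 / gcd_u(bits_per_frame, 8);
}

/* Block size in bytes for the given number of frames, rounded up to the
 * next whole byte. This is the candidate nBlockAlign value. */
unsigned block_align_bytes(unsigned bits_per_frame, unsigned frames)
{
    return (bits_per_frame * frames + 7) / 8;
}

/* Number of padding bits left unused at the end of such a block. */
unsigned padding_bits(unsigned bits_per_frame, unsigned frames)
{
    return block_align_bytes(bits_per_frame, frames) * 8
           - bits_per_frame * frames;
}
```

For 309 bits per frame, frames_per_block(309) gives 8 and
block_align_bytes(309, 8) gives 309 with no padding, whereas storing each
frame in its own block would need 39 bytes and waste 3 bits per frame.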

I see WAV primarily as an archival format. I am not very concerned
about corruption there. All reads and decodes are performed at multiples
of nBlockAlign, so you will never have a critical out-of-sync issue.
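
Seeking with fixed-size blocks could look roughly like the sketch below.
SeekTarget and seek_to_sample are hypothetical names, and the number of
samples per frame is passed in as a parameter rather than assumed. The
idea is only that the reader jumps to a block boundary, decodes from
there, and discards the samples before the requested position.

```c
#include <assert.h>

/* Result of a seek computation (hypothetical type for illustration). */
typedef struct {
    unsigned long block_offset;  /* byte offset of the block, relative to
                                    the start of the WAVE data chunk */
    unsigned long skip_samples;  /* decoded samples to discard before the
                                    requested position is reached */
} SeekTarget;

/* Map a sample position to the block that contains it.
 * block_align is the nBlockAlign value in bytes. */
SeekTarget seek_to_sample(unsigned long sample,
                          unsigned block_align,
                          unsigned frames_per_block,
                          unsigned samples_per_frame)
{
    unsigned long samples_per_block =
        (unsigned long)frames_per_block * samples_per_frame;
    SeekTarget t;

    assert(samples_per_block > 0);
    /* Integer division selects the block; the remainder is what must be
     * decoded and thrown away inside that block. */
    t.block_offset = (sample / samples_per_block) * block_align;
    t.skip_samples = sample % samples_per_block;
    return t;
}
```

Because every read starts at a multiple of nBlockAlign, the decoder always
begins at the first frame of a block and cannot land in the middle of a
frame.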

However, ACM and WAV generally do not cope well with the concept of a
variable bitrate (the lack of a seek index and the nBlockAlign methodology
are indications of that). I can try, but I am not sure it will work
reliably.
