[opus] Opus for ASR

Tue Sep 18 11:42:06 PDT 2012

Young, Milan wrote:
> [Milan] Great, so what do we (or I) need to accomplish to define
> '.asr' in the oggopus draft?  Do we first need to show that a modified
> encoder can do better than a vanilla, or is it OK to put it there as a
> placeholder now?

Write up some proposed text and send it to codec at ietf.org. I'd recommend 
also adding ".speech" and ".audio" modes, to cover the other 
"application" modes libopus currently supports (I'd leave out one for 
restricted low-delay, as that might encourage people to use it to 
recognize files that only use the MDCT modes, and implement partial 
decoders that only support such files... I don't think that mode is 
relevant when you're talking about a muxed file instead of an RTP 
stream, anyway).

I think a placeholder ".asr" mode is fine for now. Encoders which don't 
recognize it can simply take it as a synonym for ".speech".
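
As a rough sketch of that fallback, here is how an encoder front-end might map the mode names onto libopus "application" values. The mode names are only the proposals from this thread, and application_for_mode() and both tables are hypothetical. Only the two OPUS_APPLICATION_* values come from libopus's opus_defines.h.

```python
# Values as defined in libopus's opus_defines.h.
OPUS_APPLICATION_VOIP = 2048
OPUS_APPLICATION_AUDIO = 2049

# Hypothetical mode names (as proposed in this thread, not in any draft),
# mapped to the libopus application each one would select.
KNOWN_MODES = {
    "speech": OPUS_APPLICATION_VOIP,
    "audio": OPUS_APPLICATION_AUDIO,
}

# Modes an encoder may not implement, and the mode to treat each one as.
SYNONYMS = {"asr": "speech"}


def application_for_mode(mode, supported=KNOWN_MODES):
    """Return the libopus application value for a mode name.

    A mode the encoder does not recognize is replaced by its synonym,
    if one is listed; anything else is rejected.
    """
    name = mode.lstrip(".").lower()
    if name not in supported:
        name = SYNONYMS.get(name, name)
    if name not in supported:
        raise ValueError("unsupported mode: %r" % mode)
    return supported[name]
```

An encoder that later grows real ASR tuning would add its own "asr" entry to the table it passes as `supported`, and the synonym would then never be consulted.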

Since mime types are a bit outside the domain of the actual codec, it'd be 
nice to get some broader review on this type of proposal as well 
(_before_ we get to IETF Last Call). I'm not sure where the best place 
to get that would be. I'll ask and get back to you.

> One less important follow-up question.  Why is bitrate a poor choice
> in codec parameter?  I would have guessed that the application layer
> may want to advise the host on the desired bandwidth.

I just don't think that is what a mime type is for. We don't have an 
audio recording API specified yet, but if you look at 
HTMLCanvasElement.toDataURL() as a model (see 
<http://www.w3.org/TR/html5/the-canvas-element.html#dom-canvas-todataurl>), 
you can see that it takes a mime type and a "quality" parameter between 
0.0 and 1.0. So quality/bitrate is still kept separate from the type.
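
To illustrate the separation, a recording call modelled on toDataURL() might look roughly like the sketch below. Everything in it is hypothetical: no such recording API exists yet, make_recording_request() is a made-up name, and the mime type string in the usage note is only a guess at how the parameter could be spelled.

```python
def make_recording_request(mime_type, quality=None):
    """Bundle a mime type and a quality hint as two separate fields.

    quality is a float between 0.0 and 1.0, as in toDataURL(). A value
    outside that range is treated as "not specified" here, so the
    encoder would pick its own default.
    """
    if quality is not None and not 0.0 <= quality <= 1.0:
        quality = None
    return {"type": mime_type, "quality": quality}
```

For example, make_recording_request("audio/ogg; codecs=opus.asr", 0.5) says what kind of stream is wanted in the type and how much bandwidth to spend in the quality, without mixing the two.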

> [Milan] Agreed.  Sorry for not making this point clear earlier.

No worries, I just wanted to make sure I understood you correctly.