[opus] [EXTERNAL] Re: Submitting a patch that exposes VAD voiced/unvoiced signal type
jmvalin at jmvalin.ca
Fri Jun 16 18:27:01 UTC 2017
Can you say a little bit more about what you're doing exactly with the
information you're exposing and how? Unfortunately, I don't have a
concrete proposal in mind right now. That's in part because I don't
quite understand the use case, but also because it's really hard to
expose this kind of information in a way that both avoids breaking
applications with new versions and doesn't prevent future improvements to
On 08/06/17 08:20 AM, Freshman, Peter wrote:
> Hi Jean-Marc,
> Thank you for the valuable feedback. You're correct in that we focused
> on enabling this just for SILK. Because our solutions are focused on
> voice, we did not explore doing the same in CELT mode, but we can
> certainly look into the details of analysis.c.
> Regarding the concern of exposing internals, do you have a specific
> proposal in mind?
> We've been sharing this patch with our customers over the last several
> months, and the preference obviously would be to have it in the public
> domain. We're interested in any opportunity to accelerate this.
> *From:* Jean-Marc Valin <jmvalin at mozilla.com>
> *Sent:* Wednesday, June 7, 2017 2:46:52 AM
> *To:* Freshman, Peter; opus at xiph.org
> *Subject:* [EXTERNAL] Re: [opus] Submitting a patch that exposes VAD
> voiced/unvoiced signal type
> Hi Peter,
> There are two main issues with a patch like the one you're proposing.
> First, the data is only valid when SILK is being used and is essentially
> undefined in CELT mode. The second issue is that by exposing internals,
> it makes it impossible to improve these algorithms since it would break
> API compatibility. I'm not fundamentally against trying to expose some
> information, but there would have to be a way to address those two issues.
> On a slightly different topic, have you looked at the VAD probability
> that's computed in analysis.c (along with the speech/music probability)?
>> I'm reaching out because we'd like to contribute back to the project
>> a patch that exposes the signal type of the audio packet when
>> encoding the PCM audio to Opus. We've found the Opus VAD algorithm to
>> be exceptional in this regard and have written a library that
>> leverages this information for audio end-pointing. Attached is the
>> patch. Please let us know if you'd be willing to accept it, or if
>> you'd prefer we fork libopus or recommend some other option.