[opus] Opus Major Version Benchmarks on Raspberry Pi
stuartmarsden at finmars.co.uk
Thu Dec 19 10:06:57 PST 2013
I wanted to roughly benchmark how the different versions of libopus
perform at each complexity level when encoding a 6 kbit/s Opus file. The
tests were run on a Raspberry Pi, so the hardware platform is constant. It
is an early Pi with only 256MB of RAM, but memory was never exhausted, so
that should not make a difference.
I compiled the final version of each of the three major releases of
libopus: 0.9.14, 1.0.3 and 1.1. All were compiled natively on the
machine using the current repo version of gcc (4.6.3) with these flags:
-O2 -pipe -march=armv6j -mtune=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard
All were built with floating point enabled. I will look at the fixed
point version separately later.
I used a clip of speech from a LibriVox recording, resampled from
44.1 kHz to 48 kHz in Audacity. The clip is 2 minutes long. I wrote a
simple bash script that ran the encode at each complexity level,
repeating each one 10 times to get a reasonable average.
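A loop of this kind could be sketched roughly as follows. This is not the
bash script used for these results, only a minimal Python equivalent under
some assumptions: opusenc is on the PATH and accepts --bitrate and --comp,
the function names and output file names are made up for illustration, and
wall-clock time is measured instead of the speed figure that opusenc
prints.

```python
import subprocess
import time


def build_command(infile, outfile, complexity, bitrate_kbps=6):
    """Return the opusenc argument list for one encode run.

    Assumes opusenc accepts --bitrate (kbit/s) and --comp (0-10).
    """
    return [
        "opusenc",
        "--bitrate", str(bitrate_kbps),
        "--comp", str(complexity),
        infile,
        outfile,
    ]


def time_encode(cmd, runner=subprocess.run):
    """Run one encode and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    runner(cmd, check=True)
    return time.perf_counter() - start


def benchmark(infile, levels=range(0, 11), repeats=10, runner=subprocess.run):
    """Map each complexity level to a list of per-run timings in seconds."""
    results = {}
    for level in levels:
        # Hypothetical output name; one file per complexity level.
        cmd = build_command(infile, "out_c%d.opus" % level, level)
        results[level] = [time_encode(cmd, runner) for _ in range(repeats)]
    return results
```

Calling benchmark("speech.wav"), where speech.wav is a placeholder name for
the 48 kHz clip, would return ten timings per complexity level, which can
then be averaged. The runner argument exists only so the loop can be
exercised without opusenc installed.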
The results can be seen in this graph:
[image: Inline images 1]
This clearly shows what I had found earlier: an apparent regression at
complexity 7, 8, 9 and 10. From what Gregory said earlier, this is in fact
because these levels now produce higher quality than they did before. It
is still useful to know this profile if you only have a little CPU to
play with, as in embedded applications. The one thing I cannot graph is
encode quality. My ears are not good enough to hear the difference, and
unless there is an automated way to score it we will just have to assume
that each complexity level does increase the quality.
The graph also suggests that, on this platform at least, complexity level 9
is pointless, as it was slower than 10 and presumably produces worse
results. This could of course have been caused by some background task
kicking in on the OS during that run, and the error bars are quite large,
so I will see whether this holds up over other runs.
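One crude way to judge whether a gap like the one between complexity 9 and
10 is larger than the run-to-run noise is to compare the difference of the
means with the spread of each set of runs. The sketch below is only a
heuristic, not a formal significance test, and the numbers in it are
invented for illustration; they are not measurements from this benchmark.

```python
import statistics


def summarize(speeds):
    """Return (mean, sample standard deviation) for one set of runs."""
    return statistics.mean(speeds), statistics.stdev(speeds)


def clearly_different(a, b):
    """True if the one-sigma intervals around the two means do not overlap."""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return abs(mean_a - mean_b) > sd_a + sd_b


# Made-up speed figures, for illustration only.
comp9 = [1.9, 2.3, 1.7, 2.4, 1.8]
comp10 = [2.1, 2.2, 2.0, 2.3, 2.1]
print(clearly_different(comp9, comp10))
```

If the function returns False for the real per-run figures, the apparent
ordering of the two levels could simply be noise, and more runs would be
needed to settle it.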
All these speeds were taken from the opusenc output. I used version 0.1.2
of opus-tools, which is compatible with all three versions of the library.
I am running another test using 0.1.8 at the moment, but it only works
with libopus 1.0.3 and 1.1. I think I observed that it was slightly slower,
but we will see whether the results bear that out.
I will also run some tests at different bit rates and with music instead of
voice, and share the charts here. If anyone wants, I can share the
OpenOffice spreadsheet with the raw numbers and the bash script I used
(though you will have to do all the compiling yourself).
Hope this is helpful.
Tactical Communications Consultant
FinMars Consulting Ltd
On 18 December 2013 00:14, Stuart Marsden <stuartmarsden at finmars.co.uk> wrote:
> That is good to know. If the true apples-to-apples comparison is therefore
> 0.9.14 at comp 10 against 1.1 at comp 5, then things are fine. My ears are
> not good enough to hear the difference, so for speed I would target comp 5
> or lower.
> I just did a quick test and 0.9.14 at comp 10 was 3.872
> 1.1 at comp 5 was 5.218
> So if the output is comparable then we do in fact see a speed improvement.
> Thanks for pointing this out. Is it documented? I admit I have only read
> some of the documentation.
> Best Regards,
> Stuart Marsden
> On 17 December 2013 23:50, Gregory Maxwell <gmaxwell at gmail.com> wrote:
>> On Mon, Dec 16, 2013 at 5:03 AM, Stuart Marsden
>> <stuartmarsden at finmars.co.uk> wrote:
>> > I have just started trying Opus with a view to using it in a project. I am
>> > interested in embedded hardware and tried it on the Raspberry Pi using
>> > the Raspbian distro.
>> > The version of libopus in the repos is 0.9.14. I installed this and
>> > encoded 2 minutes of speech from a LibriVox recording. It managed this
>> > at a respectable pace for complexity 10:
>> Complexity 10 is new analysis code that didn't exist in prior
>> versions, setting complexity 5 gets you basically the same analysis
>> that the 1.0 version had.
>> On x86 and modern arm cores with fast FPUs the other speedups are
>> enough that complexity 10 is about the same speed in the old software
>> or the new software (but with much higher and consistent VBR quality).
>> But on chips with slow FPUs the new analysis code is much slower, in
>> particular because it has not been entirely converted to fixed point
>> (e.g. in the fixed point builds) which is what I believe you're seeing
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Size: 41961 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/opus/attachments/20131219/6320e394/attachment-0001.png