[opus] Benchmarks on Pi

Jean-Marc Valin jmvalin at jmvalin.ca
Sun Dec 22 09:40:49 PST 2013


I have to admit that I am impressed by your results -- making 1.1 look
slower than 1.0 is by no means an easy task. On the other hand, it's a
great tutorial on how not to use Opus, so for the benefit of everyone,
this is a summary of what we learned in this exercise:

1) When running on ARM, the fixed-point build is usually faster than
floating point. This is true on the majority of ARM archs (on x86, float
is faster).
2) Turning on assembly is useful too, though it only helps once you have
the fixed-point build from 1) (writing float assembly on ARM would have
been pointless).
3) Complexity settings cannot be directly compared between versions
because we keep adding things. For example, the Opus 1.1 encoder at
complexity 5 does more advanced analysis (and has higher quality) than
the 1.0 encoder at complexity 10.
4) Using 44.1 kHz is a bad idea for benchmarking since it also counts
the cost of the resampling done in opus-tools.
5) Using different opus-tools versions just adds to the randomness.
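
For anyone repeating the exercise, here is a minimal sketch of a timing
harness that follows points 4) and 5): it refuses input that is not
already at 48 kHz and runs one opusenc binary for every measurement. The
function names, the output file name and the use of wall-clock time
(rather than the speed figure opusenc prints) are assumptions made for
illustration; only the opusenc options --quiet, --comp and --bitrate are
taken from opus-tools.

```python
import statistics
import subprocess
import time
import wave


def is_48k_wav(path):
    """Return True if the WAV file at `path` is sampled at 48 kHz."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 48000


def opusenc_command(infile, outfile, complexity, bitrate_kbps):
    """Build the opusenc command line for one complexity/bitrate pair."""
    return [
        "opusenc", "--quiet",
        "--comp", str(complexity),
        "--bitrate", str(bitrate_kbps),
        infile, outfile,
    ]


def time_command(cmd, repeats=10, run=subprocess.run):
    """Run `cmd` `repeats` times (at least 2).

    Returns (mean, standard deviation) of the wall-clock time in seconds.
    `run` is injectable so the loop can be exercised without opusenc.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run(cmd, check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def benchmark(infile, outfile="bench.opus", bitrate_kbps=6,
              complexities=range(11)):
    """Time every complexity level on a 48 kHz input file."""
    if not is_48k_wav(infile):
        # Otherwise the resampler in opus-tools is timed along with the codec.
        raise ValueError("input must already be sampled at 48 kHz")
    results = {}
    for complexity in complexities:
        cmd = opusenc_command(infile, outfile, complexity, bitrate_kbps)
        results[complexity] = time_command(cmd)
    return results
```

Calling benchmark("speech48k.wav") (a hypothetical file name) returns a
dict mapping each complexity level to its (mean, stdev) pair. Remember
that the numbers are only comparable across libopus versions when the
build options match, and that equal complexity values do not mean equal
work.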

	Jean-Marc


On 12/21/2013 09:43 AM, Stuart Marsden wrote:
> I have run a few more tests at different bitrates and 1.1 is looking even
> worse in terms of speed compared to previous versions.
> 
> I have shared a Google sheet which has the raw data and charts for 6, 16
> and 32 kbps. Unfortunately you cannot show proper error bars on Google
> sheets but the standard deviation is in the data if you want to look.
> You can see that the profile for 1.1 is a lot different from 0.9.14 and
> 1.0.3. 
> 
> I will probably do another run at 64kbps and then find a music sample
> and repeat the operation.
> 
> I then will re-compile as fixed point and see what that looks like.
> 
> Link for the spreadsheet is
> here: https://docs.google.com/spreadsheet/ccc?key=0AmUOhhaYBtrKdEJaUGFmZkxzaUE0VVdZRUtJRU16bnc&usp=sharing 
> 
> Best Regards,
> 
> Stuart Marsden
> 
> Tactical Communications Consultant
> FinMars Consulting Ltd
> UK: +441865589833
> Finland: +358453046287
> 
> 
> On 20 December 2013 18:24, Stuart Marsden <stuartmarsden at finmars.co.uk> wrote:
> 
>     Cliff,
> 
>     Yes it would be good, but very hard to get a figure for the quality.
> 
>     At 6 kbps I assume it does not bother trying to figure out which mode
>     to use, as at that rate it can only use SILK. When I run some other
>     bitrates it may get a bit slower trying to decide whether it is
>     voice or music.
> 
>     I started with low bit rate because I am only really interested in
>     Voice and very low bit rate.
> 
>     I think there are so many variables that it is hard to get useful
>     metrics. For instance, I cannot really hear the difference between
>     complexity 0 and 10 on this sample. It may be, however, that 10 would
>     be much better with a poorer input from a cheap microphone with lots
>     of background noise. I have also yet to look at lost-packet
>     tolerance: is that affected by the complexity? For realtime
>     applications on most hardware it seems you could just go for the
>     default complexity 10 and hope for the best. For low-power devices
>     or microcontrollers, however, the speed difference could be crucial.
> 
>     At the moment on the Pi, which is admittedly quite an old ARM
>     architecture, the promised speed boost for 1.1 on ARM is not
>     present, unless you compare old complexity 10 with new complexity 5,
>     which I understand may be equivalent. So it is just something to be
>     aware of.
> 
>     Best Regards,
> 
>     Stuart Marsden
> 
>     Tactical Communications Consultant
>     FinMars Consulting Ltd
>     UK: +441865589833
>     Finland: +358453046287
> 
> 
>     On 20 December 2013 17:20, Cliff Parris <cliff at espico.co.uk> wrote:
> 
>         Hi All,
> 
>         What would be interesting would be a plot of complexity versus
>         subjective or objective audio quality.
> 
>         I've not had a chance to look at the new analysis code in 1.1, so
>         maybe, in the case of a 6 kbps compression, you could clarify what
>         decisions it would be making that would justify the extra
>         complexity?
> 
>         Best Regards
> 
>         Cliff Parris
> 
>         -----Original Message-----
>         From: opus-request at xiph.org
>         Sent: Thursday, December 19, 2013 6:07 PM
>         To: opus at xiph.org
>         Subject: opus Digest, Vol 59, Issue 21
> 
>         ----------------------------------------------------------------------
> 
>         Message: 1
>         Date: Thu, 19 Dec 2013 20:06:57 +0200
>         From: Stuart Marsden <stuartmarsden at finmars.co.uk>
>         Subject: [opus] Opus Major Version Benchmarks on Raspberry Pi
>         To: "opus at xiph.org" <opus at xiph.org>
>         Cc: Gregory Maxwell <gmaxwell at gmail.com>
>         Message-ID:
>         <CALPi7JckeXBzKuE2M4iG5iH91M=joj6uNO=RsTVb+qt04mKsQw at mail.gmail.com>
>         Content-Type: text/plain; charset="iso-8859-1"
> 
>         I wanted to roughly benchmark how the different versions of libopus
>         performed at each complexity level for a 6 kbit/s output Opus file.
>         This was conducted on a Raspberry Pi so it is a constant hardware
>         platform. This was done on an early Pi with only 256 MB RAM, but it
>         was never used up so should not make a difference.
> 
>         I compiled the final version of each of the three major releases of
>         libopus, so that was 0.9.14, 1.0.3 and 1.1. These were all compiled
>         natively on the machine using the current repo version of gcc 4.6.3
>         and with these optimisation flags:
> 
>         -O2 -pipe -march=armv6j -mtune=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard
> 
> 
>         These were compiled with floating point enabled. I will look at the
>         fixed-point version separately later.
> 
>         I used a clip of speech from a LibriVox recording which was resampled
>         from 44.1 kHz to 48 kHz within Audacity. The clip is 2 minutes long.
>         I wrote a simple bash script that ran the encode at each complexity
>         level and repeated it 10 times to try and get a good average.
> 
>         The results can be seen in this graph
>         http://ubuntuone.com/2gOdUG3h3MyjLY7gSYseRN
> 
>         [image: Inline images 1]
> 
>         This clearly shows what I had discovered, which appears to be a
>         regression for complexity 7, 8, 9 and 10. From what Gregory said
>         earlier, this is in fact because these levels are producing more
>         quality than they did before. It is still good to know this profile,
>         though, if you only have a little CPU to play with, such as in
>         embedded applications. The thing I cannot graph on this is encode
>         quality. My ears are not good enough to hear the difference, and
>         unless there is an automated way to score it we will just have to
>         assume that each complexity level does increase the quality.
> 
>         The graph also suggests that on this platform at least complexity
>         level 9 is pointless, as it was slower than 10 and presumably
>         produces worse results. This could of course have been some
>         background task kicking in on the OS when this ran, and the error
>         bars are quite large, so I will see if this is maintained over other
>         runs.
> 
>         All these speeds were taken from opusenc outputs and I used version
>         0.1.2 of opus-tools, which was compatible with all three versions of
>         the library. I am running another test using 0.1.8 at the moment but
>         it will only work with libopus 1.0.3 and 1.1. I think I observed that
>         it was slightly slower but we will see if the results will bear that
>         out.
> 
>         I also will run some tests at different bit rates and with music
>         instead of voice as well and share the charts here. If anyone wants,
>         I can share the OpenOffice spreadsheet with the raw numbers and the
>         bash script I used (though you have to do all the compiling
>         yourself).
> 
>         Hope this is helpful.
> 
>         Best Regards,
> 
>         Stuart Marsden
> 
>         Tactical Communications Consultant
>         FinMars Consulting Ltd
>         UK: +441865589833
>         Finland: +358453046287
> 
> 
>         On 18 December 2013 00:14, Stuart Marsden
>         <stuartmarsden at finmars.co.uk> wrote:
> 
>         > Gregory,
>         >
>         > That is good to know, and if the true apples-to-apples comparison
>         > is therefore 0.9.14 at comp 10 and 1.1 at comp 5 then things are
>         > fine. My ears are not good enough to hear the difference, so for
>         > speed I would target comp 5 or lower.
>         >
>         > I just did a quick test and 0.9.14 at comp 10 was 3.872
>         > 1.1 at comp 5 was 5.218
>         >
>         > So if the output is comparable then we do in fact see a speed
>         > improvement.
>         >
>         > Thanks for pointing this out. Is it documented? I admit I have only
>         > read some of the documentation.
>         >
>         >
>         > Best Regards,
>         >
>         > Stuart Marsden
>         >
>         >
>         >
>         > On 17 December 2013 23:50, Gregory Maxwell <gmaxwell at gmail.com> wrote:
>         >
>         >> On Mon, Dec 16, 2013 at 5:03 AM, Stuart Marsden
>         >> <stuartmarsden at finmars.co.uk> wrote:
>         >> > I have just started trying Opus with a view to using it in a
>         >> > project. I am interested in embedded hardware and tried it on
>         >> > the Raspberry Pi using the Raspbian distro.
>         >> >
>         >> > The version of libopus in the repos is 0.9.14. I installed this
>         >> > and tried encoding 2 minutes of speech from a LibriVox
>         >> > recording. It managed this at a respectable pace for complexity
>         >> > 10:
>         >>
>         >> Complexity 10 is new analysis code that didn't exist in prior
>         >> versions; setting complexity 5 gets you basically the same
>         >> analysis that the 1.0 version had.
>         >>
>         >> On x86 and modern ARM cores with fast FPUs the other speedups are
>         >> enough that complexity 10 is about the same speed in the old
>         >> software or the new software (but with much higher and more
>         >> consistent VBR quality). But on chips with slow FPUs the new
>         >> analysis code is much slower, in particular because it has not
>         >> been entirely converted to fixed point (e.g. in the fixed-point
>         >> builds), which is what I believe you're seeing here.
>         >>
>         >
>         >
> 
> 
> 
> 
> 
> _______________________________________________
> opus mailing list
> opus at xiph.org
> http://lists.xiph.org/mailman/listinfo/opus
> 


