<div dir="ltr"><div><font face="arial, helvetica, sans-serif">Thank Jean-Marc!</font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><font face="arial, helvetica, sans-serif">The speedup percentages are all relative to the entire encoder.</font><div><font face="arial, helvetica, sans-serif"><br></font></div><div><span style="color:rgb(53,53,53);white-space:pre"><font face="arial, helvetica, sans-serif">Comparing to master, this optimization patch speeds up fixed-point SILK encoder on NEON as following:


Complexity  5: 6.1%


Complexity  6: 5.8%


Complexity  8: 5.5%


Complexity 10: 4.0%</font></span></div><div><font color="#353535" face="arial, helvetica, sans-serif"><span style="white-space:pre"><br></span></font></div><div><font color="#353535" face="arial, helvetica, sans-serif"><span style="white-space:pre">when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max MHz: 2116.5</span></font></div><div><font color="#353535" face="arial, helvetica, sans-serif"><span style="white-space:pre"><br></span></font></div><div><font color="#353535" face="arial, helvetica, sans-serif"><span style="white-space:pre">Thanks,</span></font></div><div><font color="#353535" face="arial, helvetica, sans-serif"><span style="white-space:pre">Linfeng<br></span></font><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Apr 5, 2017 at 11:02 AM, Jean-Marc Valin <span dir="ltr"><<a href="mailto:jmvalin@jmvalin.ca" target="_blank">jmvalin@jmvalin.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Linfeng,<br>


<br>


Thanks for the updated patch. I'll have a look and get back to you. When<br>


you report speedup percentages, is that relative to the entire encoder<br>


or relative to just that function in C? Also, what's the speedup<br>


compared to master?<br>


<br>


Cheers,<br>


<br>


        Jean-Marc<br>


<span class="gmail-"><br>


On 05/04/17 12:14 PM, Linfeng Zhang wrote:<br>


> I attached a new patch with small cleanup (disassembly is identical as<br>


> the last patch). We have done the same internal testing as usual.<br>


><br>


> Also, attached 2 failed temporary versions which try to reduce code size<br>


> (just for code review reference purpose).<br>


><br>


> The new patch of silk_warped_autocorrelation_<wbr>FIX_neon() has a code size<br>


> of 3,228 bytes (with gcc).<br>


> smaller_slower.c has a code size of 2,304 bytes, but the encoder is<br>


> about 1.8% - 2.7% slower.<br>


> smallest_slowest.c has a code size of 1,656 bytes, but the encoder is<br>


> about 2.3% - 3.6% slower.<br>


><br>


> Thanks,<br>


> Linfeng<br>


><br>


> On Mon, Apr 3, 2017 at 3:01 PM, Linfeng Zhang <<a href="mailto:linfengz@google.com">linfengz@google.com</a><br>


</span><span class="gmail-">> <mailto:<a href="mailto:linfengz@google.com">linfengz@google.com</a>>> wrote:<br>


><br>


>     Hi Jean-Marc,<br>


><br>


>     Attached is the silk_warped_autocorrelation_<wbr>FIX_neon() which<br>


>     implements your idea.<br>


><br>


>     Speed improvement vs the previous optimization:<br>


><br>


>     Complexity 0-4: Doesn't call this function. Complexity 5: 2.1%<br>


>     (order = 16) Complexity 6: 1.0% (order = 20) Complexity 8: 0.1%<br>


>     (order = 24) Complexity 10: 0.1% (order = 24)<br>


><br>


>     Code size of silk_warped_autocorrelation_<wbr>FIX_neon() changes from<br>


>     2,644 bytes to 3,228 bytes.<br>


><br>


>     The reason of larger code size is that the new optimization<br>


>     specializes order 16, 20 and 24. If only keeping order 24<br>


>     specialization, the code still works and the code size is smaller,<br>


>     but the encoder speed will drop 4.0% for Complexity 5 and 2.0% for<br>


>     Complexity 6. Anyway, the new code is easier to understand and maintain.<br>


><br>


>     Thanks,<br>


><br>


>     Linfeng<br>


><br>


><br>


>     On Tue, Feb 7, 2017 at 8:58 AM, Linfeng Zhang <<a href="mailto:linfengz@google.com">linfengz@google.com</a><br>


</span><span class="gmail-">>     <mailto:<a href="mailto:linfengz@google.com">linfengz@google.com</a>>> wrote:<br>


><br>


>         Hi Jean-Marc,<br>


><br>


>         Thanks for your suggestions. Will get back to you once we have<br>


>         some updates.<br>


><br>


>         Linfeng<br>


><br>


>         On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin<br>


</span><div><div class="gmail-h5">>         <<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a> <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>>> wrote:<br>


><br>


>             Hi Linfeng,<br>


><br>


>             On 06/02/17 07:18 PM, Linfeng Zhang wrote:<br>


>             > This is a great idea. But the order (psEncC->shapingLPCOrder) can be<br>


>             > configured to 12, 14, 16, 20 and 24 according to complexity parameter.<br>


>             ><br>


>             > It's hard to get a universal function to handle all these orders<br>


>             > efficiently. Any suggestions?<br>


><br>


>             I can think of two ways of handling larger orders. The<br>


>             obvious one is<br>


>             simply to add an inner loop of the form:<br>


>             for (i=0;i<order;i+=VECTOR_SIZE)<br>


>             I think what may be more efficient is to simply have a small<br>


>             "order-N"<br>


>             (N=4 or 8) kernel that not only computes the correlation of<br>


>             order N, but<br>


>             then spits out the signal after the N-stage all-pass is<br>


>             applied. The<br>


>             kernel would look like:<br>


><br>


>             void autocorr_kernel4(int *corr, int *orig, int *input, int<br>


>             *output, int<br>


>             len) {<br>


>                /* Implement vectorized order-4 filter (could also be<br>


>             order 8)<br>


>                   as described in previous email and outputs the<br>


>             filtered signal.<br>


>                */<br>


>             }<br>


><br>


>             and then the full function would run the kernel multiple<br>


>             times and look<br>


>             like:<br>


><br>


>             void full_autocorr(int *corr, int *orig, int len, int order) {<br>


>                int i;<br>


>                int tmp[MAX_SIZE];<br>


>                int *in = orig;<br>


>                for (i=0;i<order;i+=4) {<br>


>                   autocorr_kernel4(corr+i, orig, in, tmp, len);<br>


>                   /* Make subsequent calls use the filtered signal as<br>


>             input. */<br>


>                   in = tmp;<br>


>                }<br>


>             }<br>


><br>


>             I think the should not only reduce/eliminate the<br>


>             prologue/epilogue<br>


>             problem, but it should also be more efficient since almost<br>


>             all vectors<br>


>             processed would use the full size.<br>


><br>


>             Maybe a third option (not sure it's a good idea, but still<br>


>             mentioning<br>


>             it) would be to have a function that hardcodes order=24 and<br>


>             discards the<br>


>             larger values that aren't needed. Since the smallest order<br>


>             seems to be<br>


>             16, it wouldn't be much of a waste and the code might end up<br>


>             running<br>


>             faster for the higher orders.<br>


><br>


>             Cheers,<br>


><br>


>                     Jean-Marc<br>


><br>


><br>


>             > Thanks,<br>


>             > Linfeng<br>


>             ><br>


>             > On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a> <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>><br>


</div></div><div><div class="gmail-h5">>             > <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a> <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>>>><br>


>             wrote:<br>


>             ><br>


>             >     Hi Linfeng,<br>


>             ><br>


>             >     On 06/02/17 02:51 PM, Linfeng Zhang wrote:<br>


>             >     > However, the critical thing is that all the states<br>


>             in each stage when<br>


>             >     > processing input[i] are reused by the next<br>


>             input[i+1]. That is<br>


>             >     > input[i+1] must wait input[i] for 1 stage, and<br>


>             input[i+2] must wait<br>


>             >     > input[i+1] for 1 stage, etc.<br>


>             ><br>


>             >     That is indeed the tricky part... and the one I think<br>


>             you could do<br>


>             >     slightly differently. If you approach the problem in<br>


>             terms of computing<br>


>             >     chunks of the inputs N samples at a time, then indeed<br>


>             the approach you<br>


>             >     are describing is the only solution. What I was<br>


>             proposing though is to<br>


>             >     instead chop the "order" in chunks of N. Using your<br>


>             notation, you would<br>


>             >     be doing:<br>


>             ><br>


>             >     PROC(<br>


>                   in0(s0))<br>


>             >     PROC(<br>


>             in0(s1) in1(s0))<br>


>             >     PROC(                                        in0(s2)<br>


>             in1(s1) in2(s0))<br>


>             >     PROC(                                in0(s3) in1(s2)<br>


>             in2(s1) in3(s0))<br>


>             >     PROC(                        in0(s4) in1(s3) in2(s2)<br>


>             in3(s1) in4(s0))<br>


>             >     PROC(                in0(s5) in1(s4) in2(s3) in3(s2)<br>


>             in4(s1) in5(s0))<br>


>             >     PROC(        in0(s6) in1(s5) in2(s4) in3(s3) in4(s2)<br>


>             in5(s1) in6(s0))<br>


>             >     PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3) in5(s2)<br>


>             in6(s1) in7(s0))<br>


>             >     PROC(in1(s7) in2(s6) in3(s5) in4(s4) in5(s3) in6(s2)<br>


>             in7(s1) in8(s0))<br>


>             >     PROC(in2(s7) in3(s6) in4(s5) in5(s4) in6(s3) in7(s2)<br>


>             in8(s1) in9(s0))<br>


>             >     PROC(in3(s7) in4(s6) in5(s5) in6(s4) in7(s3) in8(s2)<br>


>             in9(s1)in10(s0))<br>


>             >     PROC(in4(s7) in5(s6) in6(s5) in7(s4) in8(s3)<br>


>             in9(s2)in10(s1)in11(s0))<br>


>             >     ...and so on until the end of the input vector<br>


>             ><br>


>             >     The difference is that it's now the input vector that<br>


>             "slides" and the<br>


>             >     "state" values sy that remain in the same place.<br>


>             There's still a<br>


>             >     prologue, but you can easily get rid of it by<br>


>             (implicitly) zero-padding<br>


>             >     the in vector during the initialization phase (start<br>


>             with a zero vector<br>


>             >     and real one value at a time). Getting rid of the<br>


>             epilogue is a little<br>


>             >     trickier, but I think it can be done.<br>


>             ><br>


>             >     Cheers,<br>


>             ><br>


>             >             Jean-Marc<br>


>             ><br>


>             >     > Then it becomes this<br>


>             >     ><br>


>             >     > FOR in=0 to N WITH in+=8<br>


>             >     >   PROC(in0(s0)) /* prolog 0 */<br>


>             >     >   PROC(in0(s1) in1(s0)) /* prolog 1 */<br>


>             >     >   PROC(in0(s2) in1(s1) in2(s0)) /* prolog 2 */<br>


>             >     >   PROC(in0(s3) in1(s2) in2(s1) in3(s0)) /* prolog 3 */<br>


>             >     >   PROC(in0(s4) in1(s3) in2(s2) in3(s1) in4(s0)) /*<br>


>             prolog 4 */<br>


>             >     >   PROC(in0(s5) in1(s4) in2(s3) in3(s2) in4(s1)<br>


>             in5(s0)) /* prolog 5 */<br>


>             >     >   PROC(in0(s6) in1(s5) in2(s4) in3(s3) in4(s2)<br>


>             in5(s1) in6(s0)) /*<br>


>             >     > prolog 6 */<br>


>             >     >   PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3)<br>


>             in5(s2) in6(s1)<br>


>             >     in7(s0))<br>


>             >     > /* fully process 8 inputs */<br>


>             >     >   PROC(in0(s8) in1(s7) in2(s6) in3(s5) in4(s4)<br>


>             in5(s3) in6(s2)<br>


>             >     in7(s1))<br>


>             >     > /* continue */<br>


>             >     >   PROC(in0(s9) in1(s8) in2(s7) in3(s6) in4(s5)<br>


>             in5(s4) in6(s3)<br>


>             >     in7(s2))<br>


>             >     > /* continue */<br>


>             >     >   PROC(in0(s10) in1(s9) in2(s8) in3(s7) in4(s6)<br>


>             in5(s5) in6(s4)<br>


>             >     in7(s3))<br>


>             >     > /* continue */<br>


>             >     >   PROC(in1(s10) in2(s9) in3(s8) in4(s7) in5(s6)<br>


>             in6(s5) in7(s4)) /*<br>


>             >     > epilog 0 */<br>


>             >     >   PROC(in2(s10) in3(s9) in4(s8) in5(s7) in6(s6)<br>


>             in7(s5)) /* epilog<br>


>             >     1 */<br>


>             >     >   PROC(in3(s10) in4(s9) in5(s8) in6(s7) in7(s6)) /*<br>


>             epilog 2 */<br>


>             >     >   PROC(in4(s10) in5(s9) in6(s8) in7(s7)) /* epilog 3 */<br>


>             >     >   PROC(in5(s10) in6(s9) in7(s8)) /* epilog 4 */<br>


>             >     >   PROC(in6(s10) in7(s9)) /* epilog 5 */<br>


>             >     >   PROC(in7(s10)) /* epilog 6 */<br>


>             >     > END FOR<br>


>             >     ><br>


>             >     > And<br>


>             >     >   PROC(in0(s7) in1(s6) in2(s5) in3(s4) in4(s3)<br>


>             in5(s2) in6(s1)<br>


>             >     in7(s0))<br>


>             >     > /* fully process 8 inputs */<br>


>             >     >   PROC(in0(s8) in1(s7) in2(s6) in3(s5) in4(s4)<br>


>             in5(s3) in6(s2)<br>


>             >     in7(s1))<br>


>             >     > /* continue */<br>


>             >     >   PROC(in0(s9) in1(s8) in2(s7) in3(s6) in4(s5)<br>


>             in5(s4) in6(s3)<br>


>             >     in7(s2))<br>


>             >     > /* continue */<br>


>             >     > is actually the expansion of the kernel loop<br>


>             >     > FOR i=0 TO order-6 WITH i++<br>


>             >     >   PROC(in0(si+7) in1(si+6) in2(si+5) in3(si+4)<br>


>             in4(si+3) in5(si+2)<br>


>             >     > in6(si+1) in7(si+0))<br>


>             >     > END FOR<br>


>             >     ><br>


>             >     > The worst thing is that corr_QC[] is so sensitive<br>


>             that any extra<br>


>             >     > processing will make them wrong and propagate to the<br>


>             next loop (next 8<br>


>             >     > inputs). state_QS[] is a little better but still<br>


>             very sensitive. For<br>


>             >     > instance, if adding PROC(in0(s11') in1(s10) in2(s9)<br>


>             in3(s8) in4(s7)<br>


>             >     > in5(s6) in6(s5) in7(s4)) to the kernel loop (by<br>


>             looping one more time)<br>


>             >     > and remove epilog 0, then all final results will be<br>


>             wrong.<br>


>             >     ><br>


>             >     > That's why the prolog and epilog cannot be saved to<br>


>             the best of my<br>


>             >     > knowledge.<br>


>             >     ><br>


>             >     > The assembly size of<br>


>             silk_warped_autocorrelation_<wbr>FIX_neon() is about<br>


>             >     > 2,744 bytes. Compared with the C code size (about<br>


>             452 bytes), it's 2.3<br>


>             >     > KB larger. Considering<br>


>             silk_warped_autocorrelation_<wbr>FIX_c() is the<br>


>             >     second<br>


>             >     > place CPU heavy function in fixed-point, and our<br>


>             testing shows up<br>


>             >     to 7%<br>


>             >     > CPU run time saving of the total encoder with this<br>


>             optimization (at<br>


>             >     > Complexity 8), maybe we can take the I-cache burden<br>


>             even if finally we<br>


>             >     > still cannot remove the big chunk of prolog and epilog.<br>


>             >     ><br>


>             >     > Thanks,<br>


>             >     > Linfeng Zhang<br>


>             >     ><br>


>             >     > On Sat, Feb 4, 2017 at 4:17 PM, Jean-Marc Valin<br>


>             >     <<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a> <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>><br>


>             <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a> <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>>><br>


>             >     > <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a><br>


>             <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>> <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a><br>


>             <mailto:<a href="mailto:jmvalin@jmvalin.ca">jmvalin@jmvalin.ca</a>>>>> wrote:<br>


>             >     ><br>


>             >     >     Hi Felicia,<br>


>             >     ><br>


>             >     >     I've had time to work through the math in the<br>


>             original<br>


>             >     function and I'm<br>


>             >     >     pretty sure it's possible to vectorize this<br>


>             without the huge<br>


>             >     >     prologue/epilogue.<br>


>             >     ><br>


>             >     >     For the simple case where order = vector size =<br>


>             N (but it<br>


>             >     should easily<br>


>             >     >     generalize to larger order), what I came up with is:<br>


>             >     ><br>


>             >     >     initialize X, Y, M, C to vector of zeros<br>


>             >     ><br>


>             >     >     for i=0 to N+order<br>


>             >     >        T = [x(i), Y(0:N-2)]<br>


>             >     >        Y = M + coeff * (Y - T)<br>


>             >     >        M = T<br>


>             >     >        X = [x(i), X(0:N-1)]<br>


>             >     >        C = C + X*Y<br>


>             >     ><br>


>             >     >     I think something similar to this (assuming I<br>


>             didn't mess up any<br>


>             >     >     details) should give you the correlations in<br>


>             vector C. Did I miss<br>


>             >     >     anything?<br>


>             >     ><br>


>             >     >     Cheers,<br>


>             >     ><br>


>             >     >             Jean-Marc<br>


>             >     ><br>


>             >     ><br>


>             >     >     On 31/01/17 12:30 PM, Felicia Lim wrote:<br>


>             >     >     > Hi,<br>


>             >     >     ><br>


>             >     >     > Attached is a patch with arm neon<br>


>             optimizations for<br>


>             >     >     > silk_warped_autocorrelation_<wbr>FIX(). Please review.<br>


>             >     >     ><br>


>             >     >     > Thanks,<br>


>             >     >     > Felicia<br>


>             >     >     ><br>


>             >     >     ><br>


>             >     >     > ______________________________<wbr>_________________<br>


>             >     >     > opus mailing list<br>


>             >     >     > <a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>><br>


>             <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>>><br>


>             <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>><br>


</div></div>>             >     <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>>>><br>


<span class="gmail-">>             >     >     > <a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/mailman/<wbr>listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>><br>


>             >     <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>>><br>


>             >     >     <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>><br>


>             >     <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>>>><br>


>             >     >     ><br>


>             >     >     ______________________________<wbr>_________________<br>


>             >     >     opus mailing list<br>


>             >     >     <a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>><br>


>             <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>>><br>


>             <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>><br>


</span>>             >     <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a> <mailto:<a href="mailto:opus@xiph.org">opus@xiph.org</a>>>><br>


<div class="gmail-HOEnZb"><div class="gmail-h5">>             >     >     <a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/mailman/<wbr>listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>><br>


>             >     <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>>><br>


>             >     >     <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>><br>


>             >     <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a><br>


>             <<a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/<wbr>mailman/listinfo/opus</a>>>><br>


>             >     ><br>


>             >     ><br>


>             ><br>


>             ><br>


><br>


><br>


><br>


><br>


</div></div></blockquote></div><br></div></div></div>