<html>

  <head>

    <meta content="text/html; charset=iso-8859-2"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">[OMG!!! Ponies!!!]<br>

      <br>

      Let's play with numbers!<br>

      <ul>

        <li>Wikipedia ( <a class="moz-txt-link-freetext" href="https://en.wikipedia.org/wiki/Lux">https://en.wikipedia.org/wiki/Lux</a> ) lists the

          brightest illuminance ("Direct sunlight") at 100,000 Lux; the

          darkest ("Moonless, overcast night sky (starlight)<a

            href="https://en.wikipedia.org/wiki/Lux#cite_note-radfaq-3"><span>")

            </span></a>at 0.0001 Lux; the ratio of these is 1E9</li>

        <li>log2(1E9) = 29.89, about 30 f-stops/EV/LV values;

          incidentally also 30 bits, if we wanted to represent this in

          linear light</li>

        <li>the "Barten Ramp" (ITU-R Report BT.2246) as shown in the

          Dolby paper (linked from the articles below) shows the

          "Minimum Contrast Step (%)" to be above 10% for a luminance of

          0.001 cd/m^2, asymptotically approaching a bit below 0.4%

          for increasing luminances; in other words 0.4% seems to be the

          finest step anywhere on the curve (at least up to 10,000

          cd/m^2 - but I don't see any reason why it would get any finer

          with even higher amounts of light)<br>

        </li>

        <li>for this 0.4% precision we need log2(ln(1E9)/ln(1.004)) =

          12.34184435 bits</li>

        <li>13 bits gives us an even finer step, 0.2532898% precision

          (1E9^(1/2^13) = 1.002532898)</li>

        <li>but this is not a very practical unit, let's try to find

          something more practical<br>

        </li>

        <li>0.4% precision is 0.005759269 EV (about 1/173 EV)</li>

        <li>with bit shifts and table lookups in mind, let's choose the

          nearest finer power-of-two fraction, 1/256 EV, as our unit<br>

        </li>

        <li>that turns our approx. 30 EV range, from looking into the Sun

          down to looking at things on Earth under starlight, into

          256*log2(1E9) = 7653.722331 steps, which still fits nicely into

          13 bits. yay!</li>

        <li>(adding just three more bits to arrive at 16 bits would

          bring us into what-if.xkcd.com territory - see "detonation of a

          hydrogen bomb pressed against your eyeball" at

          <a class="moz-txt-link-freetext" href="https://what-if.xkcd.com/73/">https://what-if.xkcd.com/73/</a> )<br>

        </li>

      </ul>
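      <p>The arithmetic in the list above can be re-checked with a short
        Python sketch. The constant and variable names are only
        illustrative; the input figures are the ones quoted above (Wikipedia
        lux values, and the 0.4% step read off the Barten ramp).</p>

      <pre>
```python
import math

LUX_MAX = 100_000.0   # direct sunlight, figure quoted from Wikipedia
LUX_MIN = 0.0001      # moonless, overcast night sky (starlight)
WEBER_STEP = 0.004    # 0.4% minimum contrast step from the Barten ramp

ratio = LUX_MAX / LUX_MIN                       # about 1e9
range_ev = math.log2(ratio)                     # dynamic range in EV (stops)

# Bits needed so that adjacent log-spaced codes differ by at most 0.4%.
bits_for_step = math.log2(math.log(ratio) / math.log(1 + WEBER_STEP))

# Ratio between adjacent codes if the whole range is spread over 13 bits.
step_13_bits = ratio ** (1 / 2**13)

# The 0.4% step expressed in EV, and the chosen 1/256 EV unit as a ratio.
step_ev = math.log2(1 + WEBER_STEP)
step_256 = 2 ** (1 / 256)

# Number of 1/256 EV codes needed to span the range, and bits to hold them.
steps_256 = 256 * range_ev
bits_256 = math.ceil(math.log2(steps_256))

print(f"range: {range_ev:.2f} EV")
print(f"bits for a 0.4% step: {bits_for_step:.4f}")
print(f"13-bit step ratio: {step_13_bits:.9f}")
print(f"0.4% in EV: {step_ev:.6f} (about 1/{1 / step_ev:.0f} EV)")
print(f"1/256 EV step: {(step_256 - 1) * 100:.3f}%")
print(f"codes at 1/256 EV: {steps_256:.2f} -> {bits_256} bits")
```
      </pre>

      <p>Note that 1/256 EV works out to a step of about 0.27%, so it is
        finer than the 0.4% target while still fitting the range into 13
        bits.</p>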

      Executive summary:<br>

      <ul>

        <li>eliminate the format as bottleneck</li>

        <li>declare our intensity step units to be 1/256 EV</li>

        <li>anchor values to real-world absolute intensity scale (e.g.

          numeric value 0 means 1 Lumen)</li>

        <li>13 bits should be enough for starters, but going higher may

          not be far out of reach either; 16 (or even 14) bits should

          probably be more than enough for everyone - if those extra

          bits only cost a linear increase in complexity, then go for

          it; we routinely use 16 bits of audio everywhere, in rooms and

          on headphones with way less than 96dB SNR, because it is

          guaranteed to be

          good enough, and still cheap and viable with current

          technology, and unifies and simplifies things across the

          chain; perhaps it is time for the same with video<br>

        </li>

        <li>optional tone-mapping enhancement layer for contracting the

          dynamic range to current levels during decode</li>

      </ul>
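      <p>A minimal sketch of what such a code could look like, assuming a
        signed 16-bit container and a reference level of 1.0 (in whatever
        absolute unit gets picked) at code 0. The function names, the
        container and the clamping behaviour are my assumptions for
        illustration, not a worked-out format.</p>

      <pre>
```python
import math

STEPS_PER_EV = 256                    # one code step = 1/256 EV
CODE_MIN, CODE_MAX = -32768, 32767    # assumption: signed 16-bit container


def encode(intensity):
    """Map a positive absolute intensity to a code; code 0 is the 1.0 reference."""
    if not intensity > 0:
        raise ValueError("log encoding needs a strictly positive intensity")
    code = round(STEPS_PER_EV * math.log2(intensity))
    # Clamp to the container instead of wrapping around.
    return max(CODE_MIN, min(CODE_MAX, code))


def decode(code):
    """Inverse mapping: code back to absolute intensity."""
    return 2.0 ** (code / STEPS_PER_EV)


print(encode(1.0))          # reference level
print(encode(2.0))          # one EV above the reference
print(encode(100_000.0))    # direct sunlight figure from the list above
print(encode(0.0001))       # starlight figure from the list above
print(decode(encode(123.456)))
```
      </pre>

      <p>Doubling the light always adds 256 to the code, so exposure changes
        become plain integer additions, and rounding to the nearest code
        keeps the round-trip error within half a step (about 0.14%).</p>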

      <p>Details:<br>

      </p>

      <ul>

        <li>this will allow calibrated, physically-based, HDR (for most

          intents and purposes, full dynamic range) video</li>

        <li>the format dynamic range is sufficient and is well defined,

          so the format is no longer the bottleneck due to lack of

          precision or lack of definition</li>

        <li>for capture/authoring:<br>

        </li>

        <ul>

          <li>capture in real life, even on something like a Sony F65

            (14 stops of latitude -

<a class="moz-txt-link-freetext" href="http://pro.sony.com/bbsc/ssr/show-highend/resource.solutions.bbsccms-assets-show-highend-F65.shtml#/f65t1_10">http://pro.sony.com/bbsc/ssr/show-highend/resource.solutions.bbsccms-assets-show-highend-F65.shtml#/f65t1_10</a>)</li>

          <li>or for CGI/synthetic footage for composition</li>

          <li>with many current camera offerings the format is the limit

            - look at the proliferation of HDR gammas everywhere as a stopgap,

            often shoehorned into or simply outright labeled as 709, for

            lack of anything better, needing secret handshakes to

            reverse the transform at the other end; you can record into

            RAW, but I have a hunch many productions only do that to

            preserve the exposure latitude available to grade for final

            look in post, and they'd welcome the option to reduce the

            amount of data<br>

          </li>

          <li>this would eliminate that bottleneck<br>

          </li>

        </ul>

        <li>encourage full dynamic range HDR authoring workflow (even

          outside the context of the codec); be the first, catalyze the

          entire industry to follow, build momentum; break the vicious

          cycle of no innovation due to lack of demand due to lack of

          material due to lack of format due to lack of innovation<br>

        </li>

        <li>for display/presentation/consumption:</li>

        <ul>

          <li>same issue as on capture side: format is the bottleneck:

            displays are already capable of higher intensities and wider

            dynamic range than current formats offer - but look at how

            sluggish the uptake of deep color and x.v.Color is, even though

            these are only evolutionary steps, not revolutionary;

            benefit is not visible/large enough for customers to adopt,

            lack of customers stalls innovation<br>

          </li>

          <li>multiple approaches possible (considering how displays will

            become both brighter and higher dynamic range)<br>

          </li>

          <ul>

            <li>author bakes tone mapping (dynamic range reduction) into

              material, to match current dynamic range. present-proof,

              but future brighter displays won't benefit, plus less

              incentive to innovate due to lack of customer demand due

              to lack of content utilizing higher brightness/dynamic

              range<br>

            </li>

            <li>author sends full HDR through, display does full dynamic

              range reduction calibrated to its capabilities (max dynamic

              range, bit depth, dithering, etc.) - future-proof in

              theory, but quality/faithfulness depends on device

              implementor, so the final look loses (some of the) artistic

              control; although provides room for innovation for display

              manufacturers<br>

            </li>

            <li>middle-ground: baseline dynamic range standardized (at

              whatever is reasonably current), author sends full dynamic

              range HDR through, plus tone mapping enhancement layer to

              reduce dynamic range of material to standard display

              dynamic range during decode/display. display/user

              cooperatively adjust strength of contrast reduction to

              taste (e.g. at a minimum on/off, or a gradual scale of

              full to none). complete artistic control retained for

              current (and future) displays at current dynamic range

              levels, but allows graceful enhancement as better displays

              come along</li>

            <li>maybe incorporate some sort of explicit room light level

              modeling (many current TVs already have ambient light

              level sensors - let's utilize it in a meaningful way) to

              at least semi-automate the matching or perceptual matching<br>

            </li>

            <ul>

              <li>maybe offer the usual (currently in-camera) black gamma,

                knee adjustment, etc., or similar tone mapping operations,

                for savvy users<br>

              </li>

            </ul>

          </ul>

        </ul>

      </ul>
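      <p>To make the middle-ground option a bit more concrete, here is a
        toy sketch. It assumes pixel values are 1/256 EV codes and that the
        enhancement layer is a per-pixel signed offset in the same units;
        both the layer representation and the function names are
        hypothetical. The "authoring" step here is just a linear contrast
        compression around a pivot, standing in for whatever grade the
        author would really do.</p>

      <pre>
```python
def author_layer(hdr_codes, pivot_code, scale):
    """Hypothetical authoring step: offsets that compress contrast around a pivot.

    scale is target range / source range in EV, e.g. 0.5 halves the range.
    """
    return [round((c - pivot_code) * (scale - 1.0)) for c in hdr_codes]


def apply_layer(hdr_codes, layer, strength=1.0):
    """Decoder side: add the authored offsets, scaled by a strength in [0, 1].

    strength 1.0 reproduces the authored reduced-range look, 0.0 passes the
    full dynamic range through untouched.
    """
    if strength > 1.0 or 0.0 > strength:
        raise ValueError("strength must be between 0 (none) and 1 (full)")
    return [c + round(strength * d) for c, d in zip(hdr_codes, layer)]


# Three pixels at -10 EV, the reference level, and +10 EV.
hdr = [-2560, 0, 2560]
layer = author_layer(hdr, pivot_code=0, scale=0.5)

print(apply_layer(hdr, layer, 1.0))   # authored look on a baseline display
print(apply_layer(hdr, layer, 0.5))   # halfway, for a better display
print(apply_layer(hdr, layer, 0.0))   # full range, layer ignored
```
      </pre>

      <p>Because everything is in log units, the strength control is a
        simple interpolation between the authored look and the untouched
        HDR signal, which is what makes the "gradual scale of full to none"
        cheap to offer.</p>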

      On 2015.08.16. at 12:17, <a class="moz-txt-link-abbreviated" href="mailto:HuBandiT@gmail.com">HuBandiT@gmail.com</a> wrote:<br>

    </div>

    <blockquote cite="mid:55D06342.7050207@gmail.com" type="cite">


      HDR and higher bit-depth seem to be coming:<br>

      <br>

      <a moz-do-not-send="true" class="moz-txt-link-freetext"

href="http://www.dvinfo.net/article/misc/science_n_technology/hpa-tech-retreat-2014-day-4.html">http://www.dvinfo.net/article/misc/science_n_technology/hpa-tech-retreat-2014-day-4.html</a>

      section "Better Pixels: Best Bang for the Buck?"<br>

      <br>

      <a moz-do-not-send="true" class="moz-txt-link-freetext"

href="http://www.dvinfo.net/article/misc/science_n_technology/hpa-tech-retreat-2014-day-1-poynton-watkinson.html">http://www.dvinfo.net/article/misc/science_n_technology/hpa-tech-retreat-2014-day-1-poynton-watkinson.html</a><br>

      <ul>

        <li>industry seems to use 12-14 bits today; consensus seems to

          be that at least 12 bits of luma will be needed soon even for consumers;

          prosumer camcorders (e.g. Sony PXW-X70 - $2000) are doing

          10-bit 4:2:2 1080p59.94 today, and anything above $2500-3000

          seems to be 12 bit or above<br>

        </li>

        <li>looks like 13 bits would be sufficient with a simple log

          curve, Dolby is proposing 12 bits with their "Perceptual

          Quantization" curve</li>

        <li>some armchair thinking (just my pebbles to throw into the

          thinking pool):</li>

        <ul>

          <li>log encoding would have the benefit that the traditional

            color difference signals derived from log encoded RGB

            components would eliminate (white-balanced) intensity

            changes (e.g. shadows, fades/dips to-from black) from color

            channels</li>

          <li>with intensity decoupled from color:</li>

          <ul>

            <li>considerably lower color precision could be sufficient

              (since cosine falloff from object curvature, lens

              vignetting, primary light shadows no longer leak into

              chroma, no longer forcing it to have comparable precision

              to eliminate banding on final result)<br>

            </li>

          </ul>

          <ul>

            <li>maybe replace color differences with more perceptual

              metrics: some kind of saturation and hue</li>

            <ul>

              <li>could allow heavier quantization still or even lower

                color precision outright (on the assumption that hue and

                saturation change much less in well-lit, well-exposed,

                real life scenes)<br>

              </li>

              <li>think of it like one aspect of reverse Phong shading:

                shiny sphere in vacuum under white light only ever has

                its own hue - only intensity and saturation change

                (cosine falloff: towards black, highlight: towards

                white; hue channel is quasi constant, flat; real world

                will be messier e.g. hue will be pulled away by light

                reflected from surrounding objects - but see below on

                illumination/object color decomposition)<br>

              </li>

            </ul>

          </ul>

          <li>once chroma/color precision is lowered, it might make

            sense to go 4:4:4 all the time and just not bother with

            chroma down/upsampling at all</li>

          <li>establish the scene/discussion for scene decomposition:

            e.g. separately coding albedo (object reflectance) and

            illuminance</li>

          <ul>

            <li>the first step could be a separate

              illuminance/intensity/gain channel, that factors (multiply

              in linear light = addition in log light) into the final

              intensity of the output pixels</li>

            <li>encoders unwilling to utilize this can leave this

              channel blank at 0dB/0EV/1x<br>

            </li>

            <li>simplistic encoders could benefit:</li>

            <ul>

              <li>dips to/from black could preserve full color in main

                channels, and only adjust this channel</li>

              <li>crossfades could ramp up/down this channel while

                referencing main channels at the two frames at both ends

                of the crossfade (weighted prediction in linear light

                conceptually)<br>

              </li>

            </ul>

            <li>advanced encoders: separately encoding high amplitude

              scene illuminance variations from lower amplitude object

              reflectance/texture might provide coding gains, especially

              in the case of HDR<br>

            </li>

            <ul>

              <li>scene illuminance: higher amplitude, but less detail

                (mostly "broad strokes" - different statistics than main

                channels)<br>

              </li>

              <li>object reflectance/texture (main channels): smaller

                amplitude, but more detail</li>

              <li>separate prediction/motion compensation for these two<br>

              </li>

              <li>ideally, scene illuminance should be color as well, to

                predict coloured lighting (e.g. illuminance in single

                off-white or multiple light source cases)</li>

              <li>use it as hints for HDR tonemapping tool (see still

                photography research)<br>

              </li>

            </ul>

            <li>next step could be to add a highlight layer (kinda like

              specular highlights in reverse Phong shading - gradually

              blending away the area around the highlight position into

              some color of the light source, whether it's transform

              coding or some kind of shape based parametric modelling),

              there exists machine color vision research in these

              directions<br>

            </li>

            <li>doesn't need to be perfect (it's just prediction after

              all) or even cover many cases - just go for some

              low-hanging fruit, enough to spark industry

              discussion/experimentation<br>

            </li>

          </ul>

        </ul>

      </ul>
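      <p>A small Python sketch of two of the points above: color differences
        taken from log-encoded RGB do not change when the (white-balanced)
        intensity changes, and a gain channel in log light is just an
        addition. The unweighted mean used as the luma-like term and the
        function names are assumptions for illustration; a real format
        would pick proper weights.</p>

      <pre>
```python
import math


def log_encode(rgb):
    """Encode linear-light RGB (all components positive) as log2 values, in EV."""
    return tuple(math.log2(c) for c in rgb)


def color_differences(log_rgb):
    """Luma-like mean plus B-Y and R-Y style differences, taken in log light."""
    r, g, b = log_rgb
    y = (r + g + b) / 3          # assumption: plain mean instead of real luma weights
    return y, b - y, r - y


def apply_gain(log_rgb, gain_ev):
    """Gain channel: multiplying in linear light is adding in log light."""
    return tuple(c + gain_ev for c in log_rgb)


pixel = (0.8, 0.4, 0.2)                      # some orange-ish linear RGB value
shadowed = tuple(c / 8 for c in pixel)       # same surface, 3 EV darker

y0, cb0, cr0 = color_differences(log_encode(pixel))
y1, cb1, cr1 = color_differences(log_encode(shadowed))

print(f"lit:      y={y0:+.3f} cb={cb0:+.3f} cr={cr0:+.3f}")
print(f"shadowed: y={y1:+.3f} cb={cb1:+.3f} cr={cr1:+.3f}")

# A dip to black can leave the main channels alone and only ramp the gain.
faded = apply_gain(log_encode(pixel), -3.0)
print(faded)
print(log_encode(shadowed))
```
      </pre>

      <p>Only the y term moves (by exactly 3 EV here) while both differences
        stay put, which is the reason shadows and fades would stop leaking
        into the chroma channels. A gain channel left at 0 EV is a no-op,
        matching the "leave this channel blank" case.</p>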

      <br>

      <br>

      <br>

    </blockquote>

    <br>

  </body>

</html>