[ogg-dev] OggPCM: support for little-endianness only?
jean-marc.valin at usherbrooke.ca
Wed Feb 13 02:02:10 PST 2008
>> Really? How long does it take to swap bytes vs. the time it takes to read
>> from disk?
> In CPU cycles? The difference is enormous.
> If no byte swapping is required the vast majority of the work is done by
> the DMA engine, from disk straight into user space memory and while the
> read is underway, the CPU can do other work.
> For the byte swapping case, every sample needs to be loaded into the cache
> and CPU, byte swapped and then written back to memory.
Come on. Given a disk bandwidth of 40 MB/s and a memory bandwidth
(that's the bottleneck here) of 4 GB/s, you can saturate the disk with
just one percent CPU. Not to mention the fact that if you're doing
anything useful with the data, you're already paying for the memory
bandwidth. I'm pretty sure the overhead of the context switches and
libogg both dwarf the amount of CPU it takes to swap bytes (you can swap
4 16-bit samples per cycle with SSE).
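
For illustration only, here is a minimal sketch of the kind of in-place
swap being discussed. The helper name swap16_buffer is hypothetical and is
not part of libogg or any OggPCM code. It assumes 16-bit samples and uses
SSE2 intrinsics when the compiler targets SSE2, with a scalar loop as the
tail and fallback. It makes no claim about actual cycle counts. The
one-percent figure above is simply 40 MB/s divided by 4 GB/s.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: swap the two bytes of each 16-bit sample in place. */
static void swap16_buffer(uint16_t *samples, size_t count)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* SSE2 path: one 128-bit register holds eight 16-bit samples. */
    for (; i + 8 <= count; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(samples + i));
        /* (x << 8) | (x >> 8) within each 16-bit lane swaps its bytes. */
        v = _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
        _mm_storeu_si128((__m128i *)(samples + i), v);
    }
#endif
    /* Scalar tail, and the whole loop on targets without SSE2. */
    for (; i < count; i++)
        samples[i] = (uint16_t)((samples[i] << 8) | (samples[i] >> 8));
}
```

A reader would call this on each buffer after the read completes, e.g.
swap16_buffer(buf, nsamples), and only when the file's byte order differs
from the host's.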
I'd be quite interested if you could provide a benchmark that really
shows byte swapping making a difference.