<div dir="ltr"><div>Thanks for the hints everyone.</div><div><br></div><div>It looks like the metadata interface has everything I need, however I am having trouble with the encoder itself.<br></div><div><br></div><div>We have found that the reference encoder can't handle more than 2^36 samples properly. It encodes the data and calculates md5sum correctly but truncates the total samples, and then the decoder thinks there is a signature mismatch because it only checks the truncated number of samples. I have reported this here: <a href="https://github.com/xiph/flac/issues/237">https://github.com/xiph/flac/issues/237</a></div><div><br></div><div>Building the seektable is also a problem since it goes at the beginning of the file and we won't know how big to make it when recording from a live source. We will need up to 20MB of seek table for the longest recordings, but this means every recording will be at least 20MB regardless of length. Adding new seekpoints after encoding is also extremely slow as metaflac seems to decode the entire file rather than using the existing seekpoints.<br></div><div><br></div><div>We will also need parallel encoding to reach the target of encoding up to 80MB/s of data. 
I can't see any way of doing this with the public API since the stream encoder callback delivers arbitrary buffers rather than frames, meaning the results of two encoders can't easily be combined.</div><div><br></div><div>I know I could avoid all these problems by splitting the recording at arbitrary points, but having to manage a lot of small files is something I specifically want to avoid.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 1 Apr 2021 at 16:58, Roy Galaasen &lt;<a href="mailto:roy.galaasen@gmail.com" target="_blank">roy.galaasen@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto"><div dir="ltr">I was going to answer this question yesterday, so I was refreshing my memory a bit by doing some research, but life has a habit of getting in the way.</div><div dir="ltr"><br></div><div dir="ltr">My response would pretty much be like the one that was given by Martijn van Beurden, but would differ a bit regarding the sample rate:</div><div dir="ltr"><br></div><div dir="ltr">There are two places where the sample rate is given: </div><div dir="ltr"><br></div><div dir="ltr">In the STREAMINFO block and in each frame header.</div><div dir="ltr"><br></div><div dir="ltr">The documentation says that the STREAMINFO sample rate is a 20-bit value, but it is limited by the structure of the frame headers to 655350 Hz, as opposed to the seemingly possible 2^20 Hz.</div><div dir="ltr"><br></div><div dir="ltr">If you look at the frame header, there is a four-bit field that selects from a set of fixed sample rates. 
If the most significant bit is set, you can specify a more flexible value as an 8-bit sample rate in kHz, or a 16-bit value in Hz or tens of Hz, and lastly you can use the value that is given in the STREAMINFO header.</div><div dir="ltr"><br></div><div dir="ltr">That of course won’t work if you start reading in the middle of the stream, but as an archival format, this value will always be available, unless it somehow gets corrupted.</div><div dir="ltr"><br></div><div dir="ltr">I am curious, though, about the limitation mentioned, since it seems to me that you could put any value (except 0) into the STREAMINFO header and then just put %0000 into the sample rate field of each frame header to reference the STREAMINFO value instead, even values that are higher than the mentioned maximum of 655350 Hz. </div><div dir="ltr"><br></div><div dir="ltr">That makes me wonder why the documentation mentions a limitation, when in theory it should work with values up to 1048576 - 1. Is there something that the documentation does not make clear?</div><div dir="ltr"><br></div><div dir="ltr">There is something called “subset”, which is a set of criteria meant to ensure that the file is streamable. You can’t specify to use the STREAMINFO sample rate if you want to conform to the subset. Maybe that is what the documentation is referring to? 
Your need is already stated as being “archival”, hence you do not need to conform to the subset.</div><div dir="ltr"><br></div><div dir="ltr">Regards,</div><div dir="ltr">  Roy Galaasen.</div><div dir="ltr"><br></div><div dir="ltr">Disclaimer:</div><div dir="ltr">I am in no way affiliated with this project, just an enthusiast who wrote a complete decoder some time in the past.</div><div dir="ltr"><br></div><div dir="ltr"><br></div><div dir="ltr"><br><blockquote type="cite">On 1 Apr 2021, at 14:06, Martijn van Beurden &lt;<a href="mailto:mvanb1@gmail.com" target="_blank">mvanb1@gmail.com</a>&gt; wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><div dir="ltr"><div>Hi,</div><div><br></div><div>Regarding the sample rate, I'm pretty sure you'll have to resort to metadata to store that information. The FLAC format can't be extended to store such high sample rates in the frame headers. However, the normal vorbis tags accept any key=value pair you want to use, no restrictions.</div><div><br></div><div>For inserting cue marks, using a cuesheet is a rather standard way to do this for audio. However, as these use CDDA frames for location, this might not be the best way. Fortunately, you can define your own metadata format as an 'application' metadata block. See <a href="https://xiph.org/flac/format.html" target="_blank">https://xiph.org/flac/format.html</a> for more information, in particular METADATA_BLOCK_APPLICATION. I don't think using an OGG container will add anything not already possible with FLAC, except when using multiple streams, which I don't think is the case here.<br></div><div><br></div><div>To avoid rewriting a whole file, one can add a padding metadata block at capture, and use this padding to write metadata afterwards. <br></div><div><br></div><div>Regarding Python, there are quite a few options actually. 
I've been using <a href="https://pypi.org/project/SoundFile/" target="_blank">https://pypi.org/project/SoundFile/</a>, but I think it is a good idea to try a few and see which one suits your needs best. I only used it for reading FLAC files, so I can't really comment there.</div><div><br></div><div>Kind regards, <br></div><div><br></div><div>Martijn van Beurden<br></div><br></div>

<span>_______________________________________________</span><br><span>flac-dev mailing list</span><br><span><a href="mailto:flac-dev@xiph.org" target="_blank">flac-dev@xiph.org</a></span><br><span><a href="http://lists.xiph.org/mailman/listinfo/flac-dev" target="_blank">http://lists.xiph.org/mailman/listinfo/flac-dev</a></span><br></div></blockquote></div></blockquote></div><br clear="all"><br>-- <br><div dir="ltr">Alistair Buxton<br><a href="mailto:a.j.buxton@gmail.com" target="_blank">a.j.buxton@gmail.com</a></div>