[flac-dev] flac, UTF-8 and Windows

lvqcl lvqcl.mail at gmail.com
Sat Jan 9 08:20:40 PST 2016

This is how I understand flac.exe's Unicode handling under Windows:

There's a flag win_utf8_io_codepage that is equal either to CP_ACP or to CP_UTF8.
Initially it's equal to CP_ACP.

Then flac.exe/metaflac.exe call get_utf8_argv(), which converts the command-line
arguments to UTF-8 and sets win_utf8_io_codepage to CP_UTF8 on success. This is
the only way to set this flag to CP_UTF8. The programs continue to run only if
get_utf8_argv() succeeds, so inside flac.exe/metaflac.exe win_utf8_io_codepage
is always CP_UTF8 from that point on.
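
The flag logic described above can be sketched roughly as follows. This is a
simplified illustration, not the code from win_utf8_io.c: every name ending in
_sketch is hypothetical, and the conversion_ok parameter merely stands in for
whether the real command-line conversion succeeded. When not building on
Windows, the two constants are defined with their documented Windows values so
that the sketch still compiles.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Documented Windows code page identifiers, defined here only so the
 * sketch also builds on other platforms. */
#define CP_ACP  0u
#define CP_UTF8 65001u
#endif

/* Stand-in for win_utf8_io_codepage: initially the default (ANSI) code page. */
unsigned codepage_sketch = CP_ACP;

/* Hypothetical stand-in for get_utf8_argv(). Returns 0 on success.
 * conversion_ok is an assumption of this sketch; the real function
 * decides success by actually converting the arguments. */
int get_utf8_argv_sketch(int conversion_ok)
{
    if (!conversion_ok)
        return 1;                  /* failure: the flag stays CP_ACP */
    codepage_sketch = CP_UTF8;     /* the only place the flag is changed */
    return 0;
}

/* Startup pattern: the program goes on only if the conversion succeeded.
 * Returns 0 if the program may continue, 1 if it should exit. */
int startup_sketch(int conversion_ok)
{
    if (get_utf8_argv_sketch(conversion_ok) != 0) {
        fprintf(stderr, "ERROR: could not get UTF-8 arguments (illustrative message)\n");
        return 1;
    }
    return 0;
}
```

Any code that runs after a successful startup_sketch() can therefore assume
the flag is CP_UTF8, which is the invariant the command-line tools rely on.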

Actually, there's code in metaflac/operations_shorthand_vorbiscomment.c and
flac/vorbiscomment.c that wouldn't work if get_utf8_argv() had failed:
   #ifdef _WIN32 /* everything in UTF-8 already. Must not alter */

So, flac.exe calls get_utf8_argv(), then calls encode_file(), which uses
flac_fopen() itself and then indirectly calls
FLAC__stream_encoder_init_{ogg_}file() from libFLAC.

One of the parameters of this function is const char *filename.
Since filename is a UTF-8 encoded string, this libFLAC function must
call file functions from share/win_utf8_io/win_utf8_io.c.


TL;DR: some functions in libFLAC/stream_decoder.c, libFLAC/stream_encoder.c
and libFLAC/metadata_iterators.c depend on win_utf8_io.c, and I cannot see
an easy way to remove this dependency.

On the other hand, the functions from win_utf8_io.h are not part of the
FLAC public API. So 3rd-party programs that use libFLAC cannot enable
UTF-8 support and pass UTF-8 encoded filenames to libFLAC functions;
all filenames must be in the default codepage instead. And they (directly or
indirectly) link with win_utf8_io despite the fact that they cannot use it.
(Actually they do use it, but fopen_utf8() behaves for them as plain fopen(). It
just unnecessarily calls malloc/free and MultiByteToWideChar. Correct me if I'm
wrong).
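
To illustrate the last point, here is a rough sketch of what an
fopen_utf8()-style wrapper might look like. It is not the actual code from
win_utf8_io.c: the name fopen_codepage_sketch and the explicit codepage
parameter are assumptions made for illustration (the real code reads the
win_utf8_io_codepage flag). With the code page left at CP_ACP, the name is
converted from the default codepage, so the result should be roughly what
plain fopen() would have opened, at the cost of the extra allocations and
conversions.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical sketch: open a file whose name is encoded in `codepage`
 * (CP_ACP or CP_UTF8 on Windows). Returns NULL on any failure. */
FILE *fopen_codepage_sketch(const char *filename, const char *mode, unsigned codepage)
{
#ifdef _WIN32
    FILE *f = NULL;
    wchar_t *wname = NULL, *wmode = NULL;

    /* First pass: ask for the required sizes in wide characters,
     * including the terminating null. */
    int nlen = MultiByteToWideChar(codepage, 0, filename, -1, NULL, 0);
    int mlen = MultiByteToWideChar(codepage, 0, mode, -1, NULL, 0);

    if (nlen > 0 && mlen > 0) {
        wname = malloc((size_t)nlen * sizeof(wchar_t));
        wmode = malloc((size_t)mlen * sizeof(wchar_t));
    }

    /* Second pass: do the conversion, then open via the wide-char API. */
    if (wname != NULL && wmode != NULL
        && MultiByteToWideChar(codepage, 0, filename, -1, wname, nlen) > 0
        && MultiByteToWideChar(codepage, 0, mode, -1, wmode, mlen) > 0)
        f = _wfopen(wname, wmode);

    free(wname);   /* free(NULL) is a no-op */
    free(wmode);
    return f;
#else
    /* No code page conversion is needed outside Windows. */
    (void)codepage;
    return fopen(filename, mode);
#endif
}
```

A 3rd-party program never gets the chance to switch the code page to
CP_UTF8, so for it only the CP_ACP path of such a wrapper is ever exercised.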
