[vorbis-dev] OggSplit 0.1.0

Nathan I. Sharfi nisharfi at csupomona.edu
Wed Aug 6 18:20:06 PDT 2003



On Wed, 6 Aug 2003, Philip Jägenstedt wrote:

> On Wed, 6 Aug 2003 11:48:19 +0200
> Tor-Einar Jarnbjo <Tor-Einar_Jarnbjo at grosch-link.de> wrote:
>
> > On Wednesday, 6 August 2003, you wrote:
> >
> > >thing. I would like to be able to use my real name somehow :) In any
> > >event, I guess UTF-8 makes some sense since the 'ä' will be something
> > >else in a non-iso8859-1 charset anyway.
> >
> > ISO8859-1 is not really commonly used anymore. The obvious reason
> > is, that it is not capable of representing the € (euro currency sign)
> > and almost all countries using ISO8859-1 introduced the new currency
> > a few years back. Now, Windows is using its proprietary codepage,
> > in Western Europe usually 1252, and most Unix-based operating systems
> > have switched to UTF-8 as the default encoding.

Windows (NT, 2000, and XP) uses UCS-2, a Unicode encoding made out of
two-byte chunks, much like UTF-16. However, UCS-2, AIUI, doesn't extend past
U+FFFF--that is, it can't reach out of the Basic Multilingual Plane and into
the astral planes.

This is not much of a practical limitation.
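
To make the U+FFFF boundary concrete, here is a minimal Python sketch (not
anything Windows itself runs; the helper names are made up for illustration).
It counts how many 16-bit units UTF-16 needs for a character: one inside the
Basic Multilingual Plane, two (a surrogate pair) outside it, which is the part
a strict two-byte-per-character encoding cannot express.

```python
def in_bmp(ch):
    """True if the character's code point is at or below U+FFFF."""
    return ord(ch) <= 0xFFFF


def utf16_units(ch):
    """Number of 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-be")) // 2


a_umlaut = "\u00e4"      # 'ä', well inside the BMP
g_clef = "\U0001D11E"    # MUSICAL SYMBOL G CLEF, outside the BMP

print(in_bmp(a_umlaut), utf16_units(a_umlaut))  # True 1
print(in_bmp(g_clef), utf16_units(g_clef))      # False 2
```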

Older versions of Windows use Windows-1252--at least around here.
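
As a rough illustration of the euro-sign point quoted above, the following
Python sketch tries to encode U+20AC in the three encodings mentioned in this
thread. The helper name try_encode is hypothetical; the codec names are the
ones Python's standard library ships.

```python
EURO = "\u20ac"  # EURO SIGN


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


for codec in ("iso-8859-1", "cp1252", "utf-8"):
    print(codec, try_encode(EURO, codec))
# iso-8859-1 None
# cp1252 b'\x80'
# utf-8 b'\xe2\x82\xac'
```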

> > After ISO8859-1 has been used as a de facto standard for most text
> > files for quite a long time, we are back to a state, where it is
> > no more possible to use non-ASCII characters in a plain text file
> > and expect it to be readable on all systems :-/
>
> These are just text files, so it's not really a show-stopper if they
> aren't encoding-perfect. I unleashed a python script on some other
> source archives, and for example xine uses the iso8859-1 encoding in
> their AUTHORS file, ChangeLog and so on. I must question if the text
> editor isn't to blame if it can't open such a file properly.

I can't read the Fine Manual with the Usual Text Editor on this platform if
it has your name in it in that encoding. However, this thing does have a vi,
so...

Still, it's a hassle.

> In any case, UTF-8 also contains non-ASCII characters (all multi-byte
> characters are built up of non-7-bit-bytes), so... well I'm just going
> to leave the files as they are, since guesstimates tell me that way they
> will be readable by most people... or at least emacs-users ;)

If I'm reading you right, you're misinterpreting me--it's not the 8th bit
that's screwing things up--it's that TextEdit doesn't seem to recognize that
this is iso-8859-1...it's as if it's expecting UTF-8, not getting proper
UTF-8, and bailing.
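
I don't know what TextEdit does internally, so take this only as a sketch of
the failure mode I'm describing: the Latin-1 byte for 'ä' (0xE4) looks like
the start of a three-byte UTF-8 sequence, the next byte isn't a continuation
byte, and a strict UTF-8 decoder gives up. The function decode_guess is a
hypothetical helper showing the fallback I'd have expected instead.

```python
def decode_guess(data):
    """Try strict UTF-8 first; fall back to ISO-8859-1 if that fails.

    Returns (text, codec_used). ISO-8859-1 maps every byte to a character,
    so the fallback never raises, though it may guess wrong.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("iso-8859-1"), "iso-8859-1"


name = "J\u00e4genstedt"
latin1_bytes = name.encode("iso-8859-1")  # b'J\xe4genstedt'
utf8_bytes = name.encode("utf-8")         # b'J\xc3\xa4genstedt'

print(decode_guess(latin1_bytes))  # ('Jägenstedt', 'iso-8859-1')
print(decode_guess(utf8_bytes))    # ('Jägenstedt', 'utf-8')
```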

Maybe it's because I have a bunch of encodings yanked out of the list, but
Latin-1's in there.

emacs is capable--much to my surprise, I might add--of handling UTF-8.

> (I actually tried setting my locale to sv_SE.UTF8, but windowmanager
> (fluxbox) was suddenly incapable of showing any text, so I guess UTF-8
> isn't as widely supported as it should be yet.)

Right. UTF-8 support on Linux is still...how to say...politely...emergent.
I've noticed good support (read: on par with Windows 2000's) in XChat 2.0 (I
suspect this is true of any gtk2 app), but GTK2 hasn't taken over the world.

Eagerly awaiting the day when I can pick up any Linux, FreeBSD or OpenBSD
distribution and have UTF-8 Just Work,
        Nathan

> // Philip Jägenstedt