Re: commit: abi: UTF8String class

From: Karl Ove Hufthammer (huftis@bigfoot.com)
Date: Sun Apr 21 2002 - 11:19:23 EDT

Next message: Karl Ove Hufthammer: "Re: commit: abi: UTF8String class"

Previous message: Rui Miguel Silva Seabra: "abiword dtd"
In reply to: Andrew Dunbar: "Re: commit: abi: UTF8String class"
Next in thread: Scott Rushfeldt: "Re: commit: abi: UTF8String class"
Next in thread: Joaquin Cuenca Abela: "Re: commit: abi: UTF8String class"
Next in thread: Joaquin Cuenca Abela: "Re: commit: abi: UTF8String class"
Reply: Scott Rushfeldt: "Re: commit: abi: UTF8String class"
Reply: Andrew Dunbar: "Re: commit: abi: UTF8String class"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Andrew Dunbar <hippietrail@yahoo.com> wrote in
news:20020421150105.97083.qmail@web9608.mail.yahoo.com:

> *May* map to a different glyph - but glyph is not the
> correct term, I believe. You could have a c with an
> acute accent and a cedilla, for instance, which would
> need three codepoints but appear on the screen to be
> one character. I don't have the proper definition for
> glyph handy sorry.

Neither do I, but I can try: A glyph is a graphical presentation
form. In Unicode, there is no one-to-one mapping from characters
to glyphs, or the other way round. One character can be displayed
as several glyphs, and several characters can be displayed as one
glyph. E.g. the Greek letter pi and the mathematical symbol
(usually) use the same glyph (graphical presentation), but they're
different characters. Sometimes a character is displayed in
different ways depending on which language it is used in (e.g.
Japanese vs. Chinese).
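
To make the pi example concrete, here is a small Python sketch. I am
assuming the pair meant is the capital Greek pi (U+03A0) and the
product sign (U+220F); the helper name describe() is just something I
made up for illustration.

```python
import unicodedata

# Assumption: the two characters meant are capital pi and the product sign.
greek_pi = "\u03a0"
product_sign = "\u220f"

def describe(ch):
    """Return the code point and the Unicode character name (hypothetical helper)."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))

print(describe(greek_pi))
print(describe(product_sign))

# Similar shape on screen in many fonts, but they are distinct characters.
print(greek_pi == product_sign)
```

Whether the two actually share a glyph depends on the font, which is
exactly the point: the character data says nothing about it.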

But we also have combining characters in Unicode. For example, to
write an é, you can write an e followed by a combining ´. This may
be rendered as an e with ´ superimposed (which tends to look bad),
but usually a separate é glyph is used. Note that all three
characters (é, e and the combining ´) are defined in Unicode. The
precomposed é exists mainly for backwards compatibility with older
character sets (e.g. ISO-8859-1). Future versions of Unicode will
likely not add any new precomposed characters.
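
A rough sketch of the two spellings of é, using Python's standard
unicodedata module (the variable names are mine). It also builds the
c with acute and cedilla from the quoted message as three code points.
This only shows the character level; how it is drawn is up to the
renderer and the font.

```python
import unicodedata

precomposed = "\u00e9"   # é as one code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# The two strings look the same on screen but compare unequal.
print(len(precomposed), len(decomposed), precomposed == decomposed)

# Normalization converts between the two forms.
nfc = unicodedata.normalize("NFC", decomposed)   # compose
nfd = unicodedata.normalize("NFD", precomposed)  # decompose
print(nfc == precomposed, nfd == decomposed)

# c + combining acute + combining cedilla: three code points,
# normally shown as a single character on screen.
c_stack = "c\u0301\u0327"
print(len(c_stack))
```

So a string class that counts code points will still not be counting
what the user sees as characters.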

Lastly, none of this has anything to do with surrogate characters,
which complicate matters even more! :)
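
For reference, a minimal sketch of what a surrogate pair looks like,
assuming a character outside the 16-bit range (I picked U+1D11E, the
musical G clef, as an example). In UTF-16 it takes two code units; in
UTF-8 it is simply a four-byte sequence.

```python
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the 16-bit range

# Encode as big-endian UTF-16 and split into 16-bit code units.
utf16 = ch.encode("utf-16-be")
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

# One character, two UTF-16 code units (a high and a low surrogate).
print(len(ch), ["%04X" % u for u in units])

# The same character in UTF-8.
print(len(ch.encode("utf-8")))
```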

-- 
Karl Ove Hufthammer

This archive was generated by hypermail 2.1.4 : Sun Apr 21 2002 - 11:20:49 EDT