Re: undo and combining characters

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Tue Apr 23 2002 - 07:18:27 EDT


    --- Karl Ove Hufthammer <huftis@bigfoot.com> wrote:
    > Andrew Dunbar <hippietrail@yahoo.com> wrote in
    > news:20020423032923.18748.qmail@web9606.mail.yahoo.com:
    >
    > > Combining characters are now possible with Unicode for
    > > western languages but nobody is using them yet.
    >
    > Well, I have been using them on a few occasions. Mozilla
    > supports them, at least somewhat (it uses superimposing
    > glyphs).
    >
    > >> As for undoing a decomposed character (e.g. e´), I think
    > >> it's safe to undo all characters back to (and including)
    > >> the last non-combining character. For example if you
    > >> write e´
    > >
    > > Don't confuse your precomposed é above with combining
    > > characters.
    >
    > I don't. If a document contains e´ (where ´ is a combining ´),
    > pressing backspace should delete both characters, not just
    > the ´.

    This is not how Vietnamese works on Windows, and Vietnamese
    is the only language in wide use that relies on combining
    characters. Does anybody know how Unix or the Mac handles
    Vietnamese? I don't have a problem with backspace working
    the way you describe, but it will be more work and it is
    not what Vietnamese users are already used to.
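
    To make the two backspace behaviours concrete, here is a rough
    sketch in Python using the standard unicodedata module. The
    function names are made up for illustration and this is not
    AbiWord code. It also assumes that "combining character" means
    a non-zero canonical combining class, which is simpler than
    full grapheme cluster segmentation and will miss some marks.

```python
import unicodedata


def backspace_code_point(text: str) -> str:
    """Delete only the last code point (so only the accent goes)."""
    return text[:-1]


def backspace_cluster(text: str) -> str:
    """Delete back to, and including, the last non-combining character."""
    end = len(text)
    # Skip trailing combining marks (non-zero canonical combining class).
    while end > 0 and unicodedata.combining(text[end - 1]) != 0:
        end -= 1
    # Also remove the base character the marks were attached to, if any.
    return text[:max(end - 1, 0)]


# "e" followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "cafe\u0301"
after_code_point = backspace_code_point(decomposed)
after_cluster = backspace_cluster(decomposed)
```

    The first function leaves the bare "e" behind; the second
    removes the "e" together with its accent.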

    > > Normalization is a different subject which mostly
    > > comes into play with searching and sorting - it's
    > > probably only going to be confusing to mention it
    > > here.
    > > Though maybe we do need to discuss whether AbiWord
    > > should normalize all characters in its internal
    > > representation...
    >
    > Since AbiWord uses XML as its file format, the document
    > should (not must) be normalized according to Normalization
    > Form C. There was some talk about making it a requirement
    > for the next version of XML, cf.
    > <URL: http://www.w3.org/TR/xml11/#sec2.13 >, but this will
    > most likely not happen.

    Normalization Form C means "fully composed" characters. I
    think fonts that support every character we need in fully
    composed form are currently rare. At this early stage a
    "compatibility" normalization would seem more suitable,
    unless we actually ship fonts that work "fully composed"
    for all our languages.

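    For reference, a small sketch of what the normalization forms
    do to the e´ example, again in Python with the standard
    unicodedata module (the variable names are only illustrative).
    As far as I understand the Unicode forms, the decomposed
    counterpart of Form C is Form D, while the "compatibility"
    forms (KC and KD) additionally fold compatibility characters
    such as ligatures.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301"

# Form C composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Form D splits it back into base character plus combining mark.
redecomposed = unicodedata.normalize("NFD", composed)

# The compatibility forms also replace compatibility characters,
# e.g. the "fi" ligature U+FB01, which Form C leaves untouched.
ligature = "\ufb01"
ligature_nfc = unicodedata.normalize("NFC", ligature)
ligature_nfkc = unicodedata.normalize("NFKC", ligature)
```

    So the choice between composed and decomposed storage is the
    C versus D choice; the K variants change more than that.
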
    The alternative might be to roll our own glyph
    substitution, displaying combining sequences when our
    fonts have those but not the composed characters, and vice
    versa - but this would be extra work.
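
    A very rough sketch of that substitution idea, in Python.
    Everything here is hypothetical: the "coverage" set stands in
    for whatever a real font query would return, the function
    names are invented, and the cluster splitting is the same
    simplified combining-class test as before, not a real
    shaping engine.

```python
import unicodedata


def clusters(text):
    """Split text into base-character-plus-combining-marks runs (simplified)."""
    run = ""
    for ch in text:
        # A non-combining character starts a new run.
        if run and unicodedata.combining(ch) == 0:
            yield run
            run = ""
        run += ch
    if run:
        yield run


def substitute(text, coverage):
    """Per run, pick whichever of the composed or decomposed form is covered.

    `coverage` is a hypothetical set of characters the font can draw.
    """
    out = []
    for cluster in clusters(text):
        for form in ("NFC", "NFD"):
            candidate = unicodedata.normalize(form, cluster)
            if all(ch in coverage for ch in candidate):
                out.append(candidate)
                break
        else:
            # Neither form is fully covered; leave the run unchanged.
            out.append(cluster)
    return "".join(out)


# A font with only the precomposed character, and one with only the mark.
precomposed_only = set("caf\u00e9")
combining_only = set("cafe\u0301")
shown_composed = substitute("cafe\u0301", precomposed_only)
shown_decomposed = substitute("caf\u00e9", combining_only)
```

    The stored text is never changed here; only the form handed
    to the display side varies with the font.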

    Andrew Dunbar.

    > --
    > Karl Ove Hufthammer

    =====
    http://linguaphile.sourceforge.net http://www.abisource.com




    This archive was generated by hypermail 2.1.4 : Tue Apr 23 2002 - 07:19:42 EDT