Re: feasible smart quote solution

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Thu Aug 15 2002 - 00:26:09 EDT


     --- Tomas Frydrych <tomas@frydrych.uklinux.net> wrote:
    >
    > > The solution sounds feasible for actually performing the character
    > > remapping to screen. A larger and probably more interesting problem
    > > would be how to identify these smart-glyphs and differentiate between
    > > these and normal ones (the " inch marker, mismatched start/end quote
    > > pairs, ...
    > >
    > > Do you have any idea of how to feasibly do this? Might the Unicode
    > > website help us?
    >
    > There is no way to really differentiate between the two kinds of
    > glyphs. In terms of the information presented by the glyph codes,
    > the problem is not deterministic, i.e., you cannot tell if "21" is a 21
    > in quotation marks, or 21 inches opening a quote, or is it?" -- you
    > can only tell that when you know not just the wider context but also
    > the _meaning_ of the text in that context. Matching start and end quotes
    > is equally impossible for the very same reason: as long as you
    > cannot tell whether a quote is really a quote, you cannot tell which
    > quotes should be paired.

    This is why I call them "ambiguous quotes".

    > Under the circumstances I see only two realistic options: (1) no
    > smart quotes at all. This is pretty much the case at the moment, and
    > I find it rather unsatisfactory. (2) Assume that all straight quotes are
    > quotes or mid-word apostrophes and let the user handle the rest
    > manually. If he does not like it, let him turn it off.
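
For illustration only, option (2) could be sketched roughly as below. This is
Python rather than AbiWord code, and the name curl_quotes and the
"opens after whitespace or an opening bracket" rule are assumptions made for
this sketch, not anything taken from the AbiWord source.

```python
OPEN_DOUBLE, CLOSE_DOUBLE = "\u201c", "\u201d"
OPEN_SINGLE, CLOSE_SINGLE = "\u2018", "\u2019"


def curl_quotes(text: str) -> str:
    """Treat every straight quote as a real quote or a mid-word apostrophe."""
    out = []
    for i, ch in enumerate(text):
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Assumed rule: a quote opens at the start of the text, after
        # whitespace, or after an opening bracket; otherwise it closes.
        opens = prev == "" or prev.isspace() or prev in "([{"
        if ch == '"':
            out.append(OPEN_DOUBLE if opens else CLOSE_DOUBLE)
        elif ch == "'":
            if prev.isalnum() and nxt.isalnum():
                out.append(CLOSE_SINGLE)  # mid-word apostrophe, as in don't
            else:
                out.append(OPEN_SINGLE if opens else CLOSE_SINGLE)
        else:
            out.append(ch)
    return "".join(out)
```

Note that an inch mark such as the one in 6" nail simply becomes a closing
quote here. That is the case the user would have to fix by hand, or avoid by
turning the feature off.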

    The main reason we have smart quotes turned off right
    now is not because we weren't getting the context
    right or anything like that. The problem was that we
    have smart quote code sprinkled all over the place
    trying to do the same kind of guessing under various
    conditions. In some cases this proved so difficult
    that we had crashes or endless loops. Undo was the
    main problem. Various attempts were made to fix it,
    but the code was so hairy that the problem kept
    cropping up.

    So what we need is a single point of failure. We
    need a "smart quote engine" which the various bits of
    code can call instead of trying to figure it out for
    themselves.
    We need this engine to be really really simple but to
    be language-dependent.
    We need to call this engine from as few places as
    possible.
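
A minimal sketch of what such a single, language-dependent entry point might
look like, again in Python for brevity. The table QUOTE_STYLES, the function
smart_quote and its parameters are all hypothetical; the glyph choices are the
common conventions for each language and ignore details such as French
spacing.

```python
# Hypothetical per-language table:
# (open double, close double, open single, close single)
QUOTE_STYLES = {
    "en": ("\u201c", "\u201d", "\u2018", "\u2019"),
    "de": ("\u201e", "\u201c", "\u201a", "\u2018"),
    "fr": ("\u00ab", "\u00bb", "\u2039", "\u203a"),
}


def smart_quote(ch: str, opening: bool, lang: str = "en") -> str:
    """Single entry point: map one straight quote to its typographic form."""
    if ch not in ('"', "'"):
        return ch  # not a quote at all: nothing to do
    # Accept tags such as "en-US" by looking only at the primary language.
    style = QUOTE_STYLES.get(lang.split("-")[0].lower())
    if style is None:
        return ch  # unknown language: leave the straight quote alone
    open_d, close_d, open_s, close_s = style
    if ch == '"':
        return open_d if opening else close_d
    return open_s if opening else close_s
```

Callers would only decide whether the quote opens or closes. Everything
language-specific lives in the one table, so supporting another language
means adding a row rather than touching the calling code.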

    > I think as soon as you start striving for something "better" than this,
    > i.e., you start assuming that you can work out when the quote is not
    > a quote (which you cannot), you will end up with behaviour which is
    > less predictable, and consequently more irritating for the user.

    I agree. The engine should know what the previous word
    or punctuation is, what word it's in the middle of, and
    what word or punctuation comes next. If that's not
    enough, too bad. At least it won't crash.
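
As a rough sketch of that limited context, the three pieces of information
could be gathered like this. QuoteContext and quote_context are hypothetical
names, the tokenisation (runs of word characters, or single punctuation
marks) is an assumption, and pos is assumed to be a valid index of a quote
character in text.

```python
import re
from dataclasses import dataclass

# A token is a run of word characters or one punctuation mark.
_TOKEN = re.compile(r"\w+|[^\w\s]")


@dataclass
class QuoteContext:
    previous: str   # last word or punctuation before the quote, "" if none
    inside: str     # word the quote sits in the middle of, "" if none
    following: str  # first word or punctuation after the quote, "" if none


def quote_context(text: str, pos: int) -> QuoteContext:
    """Collect the context around the quote character at text[pos]."""
    before, after = text[:pos], text[pos + 1:]
    left = re.search(r"\w+$", before)  # word characters touching the quote
    right = re.match(r"\w+", after)
    inside = ""
    if left and right:
        # The quote is mid-word: report the whole word and step outside it.
        inside = left.group() + text[pos] + right.group()
        before = before[:left.start()]
        after = after[right.end():]
    prev_tokens = _TOKEN.findall(before)
    next_token = _TOKEN.search(after)
    return QuoteContext(
        previous=prev_tokens[-1] if prev_tokens else "",
        inside=inside,
        following=next_token.group() if next_token else "",
    )
```

The function only scans the text it is given and never loops on its own
output, so the worst case is a wrong guess rather than a hang.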

    Andrew.

    > Anyway, that's what I think
    >
    > Tomas

    =====
    http://linguaphile.sourceforge.net/cgi-bin/translator.pl http://www.abisource.com




    This archive was generated by hypermail 2.1.4 : Thu Aug 15 2002 - 00:29:25 EDT