Re: feasible smart quote solution

From: Tomas Frydrych (tomas@frydrych.uklinux.net)
Date: Wed Aug 14 2002 - 16:31:40 EDT


    > This assumes that a smart quote algorithm will always be smart
    > enough. It won't, not even for English. I believe there is an
    > article on this at the Unicode Web site. It *must* be possible for the
    > user to change the quotation mark used (perhaps to ", if a inch
    > character is meant).

    Sure, we provide an algorithm that works well for the most common
    cases; we provide a way of turning the whole thing off if the user
    does not like it; we provide a way to stop converting a straight
    quotation mark into a curved one in an individual case (such as '' for
    inches).

    There are three basic scenarios:

    (1) the user types a straight quote, we display a round one and she
    likes it.

    (2) She really wants a straight quote. To stop us displaying a round
    one she types alt+spacebar, then the straight quote, and gets a
    straight quote.

    (3) the user types a straight quote, gets a round quote but really
    wanted a different round quote. In this case she has to explicitly
    input the round quote she wants via the keyboard.
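
    The three scenarios can be sketched roughly as follows. This is only
    an illustration, not the actual implementation: the function name
    display_form, the opening/closing heuristic, and the choice of
    U+200C as the character that alt+spacebar inserts are all
    assumptions made for the example. The stored text is never changed;
    only the displayed form differs.

```python
# Hypothetical marker inserted by alt+spacebar. The choice of U+200C is an
# assumption made for this sketch, not something the proposal specifies.
SUPPRESS = "\u200c"

OPEN_DOUBLE, CLOSE_DOUBLE = "\u201c", "\u201d"
OPEN_SINGLE, CLOSE_SINGLE = "\u2018", "\u2019"


def display_form(stored):
    """Return the text as displayed; the stored text is left untouched."""
    out = []
    prev = ""  # previous stored character ("" at the start of the text)
    for ch in stored:
        if ch == SUPPRESS:
            prev = ch  # remember the marker, but do not display it
            continue
        if ch in "\"'" and prev != SUPPRESS:
            # Scenario 1: a straight quote is shown as a round one.
            # Simplified heuristic (an assumption): opening after start of
            # text, whitespace, or an opening bracket; closing otherwise.
            opening = prev == "" or prev.isspace() or prev in "([{"
            if ch == '"':
                out.append(OPEN_DOUBLE if opening else CLOSE_DOUBLE)
            else:
                out.append(OPEN_SINGLE if opening else CLOSE_SINGLE)
        else:
            # Scenario 2: a straight quote preceded by the marker stays
            # straight. Scenario 3: a round quote typed explicitly is an
            # ordinary character and passes through unchanged.
            out.append(ch)
        prev = ch
    return "".join(out)
```

    With this sketch, display_form('6' + SUPPRESS + '"') keeps the
    straight inch mark, while a plain '"' after a space is shown as an
    opening round quote.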

    Now, (2) is not going to be excessively common, except, say, for
    inches, and it will always require user intervention; the only way
    to avoid this is to provide no smart quote support at all, so that
    all straight quotes stay straight. If the quote selection is based
    on the same algorithm as Arabic glyph shaping, then (3) will mainly
    happen where the round quote character code overlaps with an actual
    letter code (say, a breathing mark). In this case it is more than
    reasonable to expect the user to input the actual character, not a
    straight quotation mark.

    > Correct quote character *should* be saved in the document. On the fly
    > conversion will *never* work adequately.

    That depends on what a correct character is; it could well be argued
    that the correct character is the one that the user initially input.
    One of the things I do not like about Word's handling of round
    quotes is that once the character has been converted, you need to
    delete and reinsert it to get it changed.

    The other issue is what is meant by adequately. If adequately means
    handling nearly all possible cases correctly then you are entirely
    right, since the problem is not deterministic in terms of the
    information that is stored in the document, and so can never be fully
    described by an algorithm that works from that information alone.

    If adequately means that it will make the user's life easier most of the
    time and harder only occasionally, then I think the proposed
    approach will do quite well, because I am fairly certain it will handle
    correctly all cases where the original straight quote stands for an
    actual quotation mark (in the sense that it is used for quoting, in
    contrast to representing a letter or inch mark).

    > > Further, with very little extra effort, the quote translation
    > > could be locale- specific,
    > It should be language specific, not locale specific.

    What I meant was that the lang property of the text will be taken into
    account.
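
    As a rough sketch of what taking the lang property into account
    could look like: the table and the function quotes_for below are
    hypothetical names for illustration, the table covers only a few
    languages, and falling back to English quotes for an unknown
    language is an assumption.

```python
# Primary (outer) quotation marks per language; an illustrative subset only.
QUOTES_BY_LANG = {
    "en": ("\u201c", "\u201d"),  # English: left and right double quotes
    "de": ("\u201e", "\u201c"),  # German: low-9 opening, left double closing
    "cs": ("\u201e", "\u201c"),  # Czech: same pair as German
    "fr": ("\u00ab", "\u00bb"),  # French: guillemets
}


def quotes_for(lang):
    """Return (opening, closing) for a lang value such as 'de-AT' or 'en_GB'."""
    # Only the language part is used, so the region part is ignored.
    primary = lang.replace("_", "-").split("-")[0].lower()
    # Assumption: fall back to English quotes when the language is unknown.
    return QUOTES_BY_LANG.get(primary, QUOTES_BY_LANG["en"])
```

    Because the lookup uses only the language part of the value, text
    marked de-AT and de-DE gets the same quotes, which is the
    language-specific rather than locale-specific behaviour asked for.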

    Tomas


