Re: [Fribidi-discuss] fribidi and arabic joining

From: Tomas Frydrych (tomas@frydrych.uklinux.net)
Date: Tue Mar 26 2002 - 04:29:30 EST

    Hi Behdad,

    thanks, I found that very helpful.

    > But things are not so easy, the Arabic Joining Alg.
    > itself needs the "Left" and "Right" character of a character in text,
    > which Left and Right are defined in the visual text, not logical, the
    > left and right characters cannot be found easily from the next and
    > previous character of the logical order, because of the override marks
    > (LRO and RLO).

    It's not just the LRO/RLO, but all embedding level boundaries, since
    there the character on one (visual) side of the character being
    shaped is unrelated to the next/previous logical character.

    > Example:
    > <LRO> a b C D <RLE> f g H <PDF> x Y z <PDF>
    > => <LRO> z Y x <RLE> f g H <PDF> D C b a <PDF>
    > also <LRO> a b <RLO> f g <PDF> h <LRE> x y <PDF> Z <PDF>
    > => <LRO> Z <LRE> x y <PDF> h <RLO> f g <PDF> b a <PDF>

    If I have a logical sequence abcKLMxyz (caps = RTL, overall LTR),
    then the visual order is abcMLKxyz, i.e., the visual left and right of
    K and M cannot be worked out in the logical space, because 'c' is in
    no way related to 'x' and 'L', and 'x' is in no way related to 'c' and 'L'. If the
    joining algorithm is defined in visual space, then I do not see how
    you can get around having to completely reorder first.
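
    A minimal sketch of the abcKLMxyz example, assuming a toy model in
    which caps are RTL, everything else is LTR, and there is a single
    level of embedding. The helper names (visual_order, neighbours) are
    made up for illustration and are not part of fribidi.

```python
import re

def visual_order(logical):
    """Toy reordering: reverse every run of capitals (caps = RTL, base LTR)."""
    return re.sub(r"[A-Z]+", lambda m: m.group(0)[::-1], logical)

def neighbours(text, ch):
    """Return the (left, right) neighbours of ch in text, or None at an edge."""
    pos = text.index(ch)
    left = text[pos - 1] if pos > 0 else None
    right = text[pos + 1] if pos + 1 < len(text) else None
    return left, right

logical = "abcKLMxyz"
visual = visual_order(logical)

print(visual)
# K is logically next to 'c' but visually next to 'x';
# M is logically next to 'x' but visually next to 'c'.
for ch in "KM":
    print(ch, "logical:", neighbours(logical, ch), "visual:", neighbours(visual, ch))
```

    The visual neighbours of K and M only come out of the reordered
    string, which is the point being made above.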

    > The first idea needs some work to prove the independency (which
    > may not be true).

    Shaping is _not_ independent of line breaking; here is why:

    (1) Unidirectional text: the BIDI transformation does not change the
    visual sequence of glyphs, but just the orientation of the coordinate
    system from which it is observed, i.e., the context of individual
    characters remains identical. Line breaking makes no difference
    here, since it does not impact the visual ordering (the individual
    lines are mere snapshots of segments of the 'original' single long
    (visual) line).
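
    Case (1) can be illustrated with a small sketch, assuming purely
    RTL text whose visual order is simply the reverse of its logical
    order. The names below are illustrative only.

```python
def adjacent_pairs(s):
    """Unordered pairs of characters that sit next to each other in s."""
    return {frozenset(pair) for pair in zip(s, s[1:])}

logical = "KLMNOPQ"        # all caps = purely RTL text
visual = logical[::-1]     # unidirectional text: the whole line is mirrored

# Mirroring keeps every character next to the same two characters.
same_context = adjacent_pairs(logical) == adjacent_pairs(visual)

# Break the line after 'M' (in logical order) and reorder each line separately.
line1, line2 = logical[:3], logical[3:]
visual1, visual2 = line1[::-1], line2[::-1]

# Each visual line is a contiguous slice of the single long visual line.
is_snapshot = visual1 in visual and visual2 in visual

print(visual, visual1, visual2, same_context, is_snapshot)
```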

    (2) True bidi text: (1) applies to all characters except those on
    a direction boundary. For the characters on the boundary, we have
    two cases:
       (2a) the character starts an embedding level: the context is given
       by the next (logical) character and the last (logical) character of
       the preceding embedding level

       (2b) the character ends an embedding level: the context is given by
       the previous (logical) character and the first (logical) character
       of the next embedding level.

    Now, line breaking does not change the visual sequence of the text
    if the line break falls within a segment whose visual direction is
    identical to the base direction; it has exactly the same effect as
    in case (1).

    If the line break falls into a segment with a different visual direction
    than the base direction, the visual order of glyphs is impacted (we
    no longer have simple snapshots of segments of the original long
    visual line). The question of whether this has any impact on the
    shaping then boils down to one thing: is the contextual value of the
    glyph immediately before the last embedding level change on the
    previous line identical to the contextual value of the character on
    which a line break is allowed? If it is (A), then the joining is
    independent of line breaking. If it is not (B), then joining is affected
    by line breaking:

    abcKLM OPQ xyz => abcQPO MLK xyz
    with a line break between M and O:
    abcMLK
    QPO xyz
    i.e., the original line would have had medial Q and final M, the new
    line has medial M and final Q.
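
    The reordering in this example can be reproduced with a rough
    sketch, assuming a toy model: caps are RTL, lowercase is LTR, the
    base direction is LTR, and a space takes the RTL direction only
    when it is enclosed by RTL letters on both sides. This is not the
    full bidi algorithm, and the function names are hypothetical.

```python
def resolve_dirs(text):
    """Toy direction resolution: caps = R, everything else L, except that
    spaces enclosed by capitals on both sides become R."""
    dirs = []
    for i, ch in enumerate(text):
        if ch.isupper():
            dirs.append("R")
        elif ch == " ":
            before = text[:i].rstrip(" ")
            after = text[i + 1:].lstrip(" ")
            enclosed = (bool(before) and bool(after)
                        and before[-1].isupper() and after[0].isupper())
            dirs.append("R" if enclosed else "L")
        else:
            dirs.append("L")
    return dirs

def visual_order(text):
    """Reverse every maximal run of R characters; leave L runs untouched."""
    dirs = resolve_dirs(text)
    out, i = [], 0
    while i < len(text):
        j = i
        while j < len(text) and dirs[j] == dirs[i]:
            j += 1
        run = text[i:j]
        out.append(run[::-1] if dirs[i] == "R" else run)
        i = j
    return "".join(out)

def visual_neighbours(line, ch):
    """Return the (left, right) visual neighbours of ch, or None at an edge."""
    pos = line.index(ch)
    left = line[pos - 1] if pos > 0 else None
    right = line[pos + 1] if pos + 1 < len(line) else None
    return left, right

unbroken = visual_order("abcKLM OPQ xyz")
line1 = visual_order("abcKLM")     # line break between M and O;
line2 = visual_order("OPQ xyz")    # the space at the break is dropped

print(unbroken)
print(line1)
print(line2)
print("M:", visual_neighbours(unbroken, "M"), "->", visual_neighbours(line1, "M"))
print("Q:", visual_neighbours(unbroken, "Q"), "->", visual_neighbours(line2, "Q"))
```

    The sketch only reports which characters end up next to M and Q
    before and after the break. Whether that changes the joining form
    depends on the joining classes of the real characters, which this
    toy model ignores.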

    In theory (B) is entirely legitimate, and so joining is not
    independent of line breaking. In real life, the assumption that (A) is
    true will often be satisfied, because we can expect a character
    such as a space on the embedding level boundary.

    It seems to me that the only way to completely resolve the circular
    dependence here is by defining the joining algorithm in logical
    space.

    Tomas

    > But the second one which is a bit complex
    > seems to produce the desired result. I will provide the test
    > cases for different cases in another mail.
    >
    > [End of BiDi vs. Arabic Joining interaction material, the rest is
    > fribidi related.]
    >
    > B. Our implementation of the Arabic Joining Algorithm is quite
    > small and light, that will not harm the objectives of fribidi at all,
    > but makes it much more useful, either the command line tool (that can
    > be used to cat right to left files), and the library. Many
    > applications that use fribidi do not support Arabic Joining as there
    > is no light-weight implementation of it available, or the author just
    > wanted it to work for hebrew. But with Arabic Joining in fribidi the
    > developer can just easily turn the arabic joining on to work well for
    > arabic too.
    >
    > C. The Pango is not a real solution for the audience of fribidi:
    > fribidi has been ported to some mobile devices. Also fribidi has been
    > used on linux console and xterm, that is not a good idea to use pango
    > for arabic joining there. fribidi is mostly used for hebrew and
    > arabic scripts, which their rendering will be completed with arabic
    > joining algorithm, then we should not worry about other shaping
    > matters, when shaping of all the Unicode characters is needed, the
    > fribidi feature can be turned off.
    >
    > D. Using the Unicode Arabic Presentation Forms is also essential with
    > Linux console, as the kernel maps the Unicode codepoints to glyphs,
    > for other scripts like syriac which does not have the presentation
    > forms in unicode, their presentation forms should be registered in the
    > private area of unicode (H. Peter Anvin is responsible for registering
    > them in linux), to be able to show them in linux console.
    >
    > E. About the overhead of it on fribidi, I believe that the
    > hebrew community should not be so happy, but:
    >
    > 1. It can be fully turned off with a configure time
    > option.
    > 2. When compiled with Arabic Joining, by default it's
    > off, the developer should turn it on if needed.
    > 3. I try to put it in a different binary to save the
    > resources.
    >
    >
    > I hope that with the above discussion there will be enough
    > reasons for all of you to put it in fribidi.
    >
    > Yours,
    > -- Behdad Esfahbod 6 Farvardin 1381, 2002 Mar 26
    > <behdad at bamdad dot org> [Finger for Geek Code]


