Re: CJK patch test error report!


Subject: Re: CJK patch test error report!
From: Vlad Harchev (hvv@hippo.ru)
Date: Tue Oct 31 2000 - 02:17:11 CST


On Tue, 31 Oct 2000, Belcon wrote:

> Vlad Harchev wrote:
> >
> > On Tue, 31 Oct 2000, hashao wrote:
> >
> > Hello,
> >
> > > Hello Vlad,
> > >
> > > On Tuesday, October 31, 2000, Vlad Harchev wrote:
> > >
> > > VH> On Mon, 30 Oct 2000, Belcon wrote:
> > > >> problem:
> > > >> * We still can't see any Chinese (GB2312) characters in the frame
> > > >> (they are invisible). If we open a GB2312 abw file, it just shows
> > > >> blank cells (each two ASCII letters wide) where the GB2312
> > > >> characters should be. If we select these Chinese characters, even
> > > >> though we can't see them, and then choose the 'Find' menu item,
> > > >> the Chinese characters are shown in the box quite well.
> > >
> > > VH> Could you please elaborate - in which box - in the input entry, or the
> > > VH> characters when highlighted in the document?
> > > VH> Does it measure the width of Ch chars correctly (i.e. if you type English
> > > VH> char, then Ch char, then English char - can Ch char fit between the English
> > > VH> ones or not)?
> > >
> > > The problem is in the usage of the function gdk_draw_text. The GB (and
> > > EUC-JP) text encoding is different from the encoding of its fonts, which
> > > is ISO-2022. To draw an EUC-encoded GB char, you have to convert it to
> > > its font encoding according to the current locale and X11/locale/*/.
> > > Then the function XDrawString16() can be called to draw the glyph.
> > >
> > > EUC encoding starts from A1A1 while ISO-2022 starts from 2121. So you
> > > will get nothing by drawing an A1A1 in an ISO-2022-encoded font.
> > >
> > > This conversion is handled internally by X's I18N functions like
> > > XmbDrawString() and XwcDrawString(). Gdk hides this further with its
> > > gdk_draw_text() function. BTW, this is what was used in the cjk-patch
> > > in file src/af/gr/unix/gr_UnixGraphics.cpp, in the functions
> > > GR_UnixGraphics::drawChar() and GR_UnixGraphics::drawChars().
> > >
> > > As for gdk_draw_text, if it is called with a single font, it will use
> > > the non-i18n functions XDrawString() and XDrawString16() without any
> > > encoding conversion. If gdk_draw_text is called with a fontset, it will
> > > use XmbDrawString, which has the encoding conversion built in.
> > > See the gdk source: gdk/gdkdraw.c
> > >
> > > So one solution is to simply construct a fontset from EnglishFont and
> > > ChineseFont, then draw the text by calling gdk_draw_text with the
> > > fontset.
> >
> > Thank you very much for your explanation. I think the 1st solution is much
> > more generic so I will implement it.
>
> After a quick hack in gr_UnixGraphics.cpp, I found that the problem of
> GB2312 characters not being shown in the document was solved. But I am
> not familiar with gtk/gdk, so I will wait until Vlad makes a patch.
>
> > As I understand, TurboLinux ships hacked gtk since HJ says that AW with his
> > patch works fine there with GB too.
> >
>
> Maybe!

 Could you try to install that version of gtk and try with it?

> > > Another solution is to do the encoding conversion according to current
> > > locale. Then draw things with the current logic.
> > >
> > > P.S. There are no problem with BIG5 char because the encoding for
> > > both BIG5 text and the fonts are the same. So no conversion is
> > > needed.
> >
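
The conversion mentioned above (EUC text starting at A1A1, font indices
starting at 2121) can be sketched in C as below. This is only a minimal
illustration, not code from the cjk-patch: the names Char2b and
euc_to_font_index are hypothetical, Char2b merely mirrors the two-byte
layout of Xlib's XChar2b so the sketch builds without X11 headers, and it
assumes the input contains only double-byte EUC characters (ASCII would
have to be split out and drawn with the English font).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Xlib's XChar2b: one two-byte glyph index. */
typedef struct {
    unsigned char byte1;
    unsigned char byte2;
} Char2b;

/* Convert double-byte EUC text (each byte in 0xA1..0xFE) into font
 * indices (each byte in 0x21..0x7E) by clearing the high bit.
 * Returns the number of glyphs written, or -1 if the input contains
 * a byte outside the EUC double-byte range, ends in the middle of a
 * character, or does not fit into dst. */
static int euc_to_font_index(const unsigned char *src, size_t len,
                             Char2b *dst, size_t dst_cap)
{
    size_t i = 0;
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    while (i < len) {
        if (i + 1 >= len)
            return -1;                 /* truncated character */
        if (src[i] < 0xA1 || src[i] > 0xFE ||
            src[i + 1] < 0xA1 || src[i + 1] > 0xFE)
            return -1;                 /* not a double-byte EUC char */
        if (n >= dst_cap)
            return -1;                 /* output buffer too small */

        dst[n].byte1 = src[i] & 0x7F;      /* 0xA1 -> 0x21 */
        dst[n].byte2 = src[i + 1] & 0x7F;
        n++;
        i += 2;
    }
    return (int)n;
}
```

With real Xlib, the resulting array would be the kind of data passed to
XDrawString16() together with a font whose indices start at 2121; with a
fontset, XmbDrawString does this step itself.
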
> > Do you have any idea why it's impossible to input anything into AW with
> > my patch? It has exactly the same logic as HJ's patch with respect to
> > keyboard input. Maybe it's also because HJ's version of gtk was hacked
> > in this respect too?
>
> No, I don't think so. I can input Chinese in AbiWord-0.7.10 patched with
> HJ's patch, although I don't know why I can't input Chinese in
> AbiWord-0.7.11. I am still trying to find the reason.

 Hmm, very strange. Have you traced src/af/ev/unix/*keyboard.cpp?

 Also, if an uncleared shift state is to blame for the inability to input
CJK, then it would still be possible to input a Ch char the _first_ _time_
(and you wouldn't be able to input a Ch char after that). Is this the case
(i.e. can you input one Ch character right after AW is started)?

> > Or maybe Input Methods should be set up on the frame?
> >
> > Also, just curious: my RH6.0's glibc knows, among others, 3 encodings:
> > ISO-2022-JP-2, ISO-2022-JP, ISO-2022-KR
> > Are they just synonyms, or are they different?
> >
>
> Sorry,I don't know.
>

 Best regards,
  -Vlad


