Subject: Re: CJK patch test error report!
From: Vlad Harchev (hvv@hippo.ru)
Date: Tue Oct 31 2000 - 05:21:52 CST


On Tue, 31 Oct 2000, Belcon wrote:

 Hi,

> Hello:
> I am sorry for this big mail!
> If you receive this letter, please reply to me, Vlad.
> Otherwise, I am not sure whether this mail was sent or not.
> Then I will send it again. Thanks!

 Hmm, sorry, I didn't understand what you meant.
 
> Vlad Harchev wrote:
> >
> > On Tue, 31 Oct 2000, Belcon wrote:
> >
> > Hi,
> >
> > > Hello:
> > >
> > > Vlad Harchev wrote:
> > >
> > > > What are your ideas on fixing input of CJK chars? Have you tried them?
> > > >
> > >
> > > I am still trying.
> >
> > And what are the ideas?
>
> Vlad, I have solved this problem. I just changed a line of
> src/af/ev/unix/ev_UnixKeyboard.cpp
> Here it is:
> 404 if (keyval > 0x0000FFFF)
> 405 // return UT_TRUE; // yes, it is a virtual key.
> 406 return UT_FALSE;
> HJ's patch changed this. Maybe you forgot to include the change.
> After changing the return value, I can input Chinese happily.
> :-)

 Hmm, it seems I forgot to include it. I'm very sorry.
 But how do you know that you can input Ch if AW doesn't show it? :)
 Unfortunately, my next-cjk-patch doesn't include it yet, so all CJK testers
will have to do it manually for now.

>
> >
> > > > > In Big5:
> > > > > Work fine:
> > > > > * Same as GB2312
> > > > > * Shows Big5 Chinese characters quite well.
> > > >
> > > > Quite well where - in the document?
> > > > Are there any flaws?
> > >
> > > Yes,I found some bugs.
> > > * AbiWord can't load a .abw file, even one converted from an opened Big5
> > > txt file. AbiWord complained that the .abw file is not a legal .abw file.
> > > But I can load a GB2312 .abw file quite fine.
> >
> > I don't understand you. First you say "AW can't load .abw, even ..", then "
> > can load GB2312 .abw file quite fine". Could you please explain it?
> >
>
> I am sorry for my poor English, Vlad.
> Now I will try my best to explain the bug.
> Sorry, I made a mistake.
> If I open a Chinese (GB2312 or Big5) txt file,
> "chinese.txt" for example, in AbiWord,
> and then save it as chinese.abw, then close AW.
> Next, I use the command "AbiWord chinese.abw" to open
> the chinese.abw saved in the previous step.
> AW says "this file is not a legal abw file" and
> refuses to open it.
> I think we add some extra information to chinese.abw
> that AW does not accept. I haven't debugged
> it.
> Vlad, if you still can't understand what I said, I must say sorry
> to you, and I will try another way to explain this bug.

 Belcon, thanks for your explanation. I understand you now. I've already
described the way to fix that problem: just remove 'encoding="YOUR-ENCODING-NAME"'
from the 1st line of the .abw file.
 The new version of my next-cjk-patch does this automatically (to be precise, it
doesn't save 'encoding="YOUR-ENCODING-NAME"', but you'll still have the problem
with files that were already saved, since it won't ignore that part on reading,
so fix them by hand).
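
 For fixing already-saved files by hand, a small script can do the same edit.
The following is only a sketch and not part of any patch: the function names
are made up for illustration, and it assumes the file starts directly with the
XML declaration. It works on raw bytes so that the document text itself is
never decoded or re-encoded.

```python
import re

# Matches the encoding="..." (or '...') part inside an XML declaration.
_ENCODING_ATTR = re.compile(rb"""\s+encoding\s*=\s*(?:"[^"]*"|'[^']*')""")


def strip_xml_encoding(data: bytes) -> bytes:
    """Remove the encoding part from a leading XML declaration.

    Data that does not start with an XML declaration is returned unchanged.
    """
    if not data.startswith(b"<?xml"):
        return data
    end = data.find(b"?>")
    if end == -1:
        return data
    declaration = _ENCODING_ATTR.sub(b"", data[:end], count=1)
    return declaration + data[end:]


def fix_abw_file(path: str) -> bool:
    """Rewrite the file in place; return True if it was changed."""
    with open(path, "rb") as f:
        original = f.read()
    fixed = strip_xml_encoding(original)
    if fixed != original:
        with open(path, "wb") as f:
            f.write(fixed)
    return fixed != original
```

 Calling fix_abw_file("chinese.abw") would turn the first line into
<?xml version="1.0"?> and leave the rest of the file byte-for-byte as it was.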

> > Also, the typical .abw file AW saves starts with
> > <?xml version="1.0" encoding="YOUR-ENCODING-NAME"?>
> >
> > The current code makes AW refuse to load the file if YOUR-ENCODING-NAME is a
> > multibyte encoding, even though there are no multibyte strings in the file.
> > So for now, before opening .abw files you've saved, edit the first line of
> > the file to read:
> > <?xml version="1.0"?>
> >
> > i.e. remove 'encoding="YOUR-ENCODING-NAME"' part.
> >
> > Please report results.
>
> Yes, it works. Do we need to make a patch to let AW know this is a legal
> abw file?

 Already done in my next patch, as I said. AW will never write that part (but
when it encounters it, it will still treat the file as illegal).
 
  
> >
> > > * Now AbiWord crashes if I select several characters (highlighting
> > > them with the mouse) and then select the menu item "Copy" or press Ctrl+C,
> > > even if the characters are English letters. The bug exists in both GB2312
> > > and Big5. I attached the debug message
> >
> > Hmm, very strange. Could you make a backtrace? The debug message is almost
> > useless since it doesn't show where AW crashed.
> > To do this, run "echo bt | gdb -c AW-CORE-FILE-NAME" as I remember (or just
> > attach to the running AW, make it crash, and then type 'bt').
> >
>
> Here is the information:
>[...]

 OK, thank you. Now I know the reason why it crashes.
 Here is a short description:

 AW implements cut&paste by saving a portion of the document to RTF and
importing the resulting RTF at another place.
 When saving RTF, AW allows using an encoding different from the current one
(that is important for Russian, for example). The name of that encoding, as
understood by iconv, is returned by
        XAP_EncodingManager::charsetFromCodepage(lid)
 lid is always 0x404 for Chinese.
 wvLIDToCodePageConverter(0x404) will return CP936.
 After that, the return value of wvLIDToCodePageConverter(0x404) is looked up
 in MSCodepagename_to_charset_name_map in xap_EncodingManager.cpp.
 In your case it finds "GB" and uses it as the encoding name passed to
 iconv. Your iconv doesn't recognize it (iconv_open returns (iconv_t)-1).
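
 The lookup chain can be modelled roughly as follows. This is a Python
stand-in, not AbiWord code: the two tables hold only the values mentioned in
this mail, Python's codec registry plays the role of iconv, and the fallback
to the codepage name when the map has no entry is an assumption.

```python
import codecs

# Illustrative stand-ins holding only the values mentioned in this thread.
LID_TO_CODEPAGE = {0x404: "CP936"}
CODEPAGE_TO_CHARSET = {"CP936": "GB"}


def charset_from_lid(lid, cp_map=CODEPAGE_TO_CHARSET):
    """Map LID -> codepage name -> charset name.

    Assumption: when the map has no entry, the codepage name itself is used.
    """
    codepage = LID_TO_CODEPAGE[lid]
    return cp_map.get(codepage, codepage)


def converter_known(name):
    """Stand-in for 'iconv_open() did not return (iconv_t)-1'."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False
```

 With the table as shown, charset_from_lid(0x404) yields "GB", which the codec
registry does not know either, so the failure has to be handled before any
conversion is attempted.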

  So, to fix things use the following algorithm:

 If "CP936" is known to your iconv implementation (check the output of iconv
--list), then remove the entry
            {"CP936","GB"},
 from MSCodepagename_to_charset_name_map in xap_EncodingManager.cpp,
 and AW won't crash on export.

 If your iconv doesn't know CP936, replace the line
        {"CP936","GB"},
 with
        {"CP936","GB2312"},
 and copying to the clipboard and saving RTF will work fine (but it's not
 guaranteed that MSWord will read these RTFs).
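
 The two cases above can be written down as one small decision function. This
is a sketch in Python, not a patch: patched_codepage_map is a made-up name,
the dictionary stands in for the C++ table, and iconv_names is assumed to be
the list of names taken from the output of iconv --list.

```python
def patched_codepage_map(cp_map, iconv_names):
    """Return a copy of cp_map adjusted according to the two cases above."""
    known = {name.upper() for name in iconv_names}
    fixed = dict(cp_map)
    if "CP936" in known:
        # iconv accepts "CP936" directly, so the mapping entry is dropped.
        fixed.pop("CP936", None)
    else:
        # Otherwise map CP936 to "GB2312" instead of "GB".
        fixed["CP936"] = "GB2312"
    return fixed
```

 The original map is left untouched, so the result can be compared with it
before editing the source file.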

 Once AW exports RTF safely, check how Word or other RTF-aware apps
 read it.

 Also try pasting from AW to AW. Does it work?

> > > >
> > > > > problem:
> > > > > * We can't input Chinese character in Frame.(This is easy to
> > > > > fix.)
> > > > >
> > > > > Maybe I will try Japanese and Korean,but not now.
> > > >
> > > > * What is your $LANG is set to?
> > >
> > > $LANG=zh_CN.GB2312 or $LANG=zh_TW.Big5
> >
> > There is a big chance that your iconv implementation doesn't know encoding
> > named "GB2312" (for example, it knows it under different name, say "GB-2312").
> > If your AW uses iconv from glibc, then type
> > iconv --list
> > and check whether GB2312 is in the list of encodings. If not, find something
> > with a similar name, e.g. GB-2312 or GB_2312, and tell me what it is. If you use
> > libiconv, then it definitely doesn't know GB2312 - it knows only
> >
> > EUC-CN, HZ, GBK, EUC-TW, BIG5, CP950, ISO-2022-CN,
> > ISO-2022-CN-EXT
> >
> > though GBK may mean GB2312.
> > Then please relink AW with iconv from glibc (or hack libiconv to understand
> > GB2312, which is less desirable).
> >
>
> Now this bug has been fixed.

 OK.
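
 For the record, the name check quoted above (looking for GB2312 under a
slightly different spelling such as GB-2312 or GB_2312) can be done
mechanically. The following is a rough Python sketch with made-up function
names; it assumes the list is whitespace- or comma-separated and that names
may carry a trailing "//", and it compares names after dropping everything
except letters and digits.

```python
import re


def split_iconv_list(text):
    """Split the text of an encoding list into bare names."""
    tokens = re.split(r"[\s,]+", text)
    return [tok.rstrip("/") for tok in tokens if tok.rstrip("/")]


def _key(name):
    # Compare case-insensitively and ignore '-', '_' and other punctuation.
    return re.sub(r"[^A-Z0-9]", "", name.upper())


def similar_names(wanted, names):
    """Return the names that differ from wanted only in case or punctuation."""
    target = _key(wanted)
    return [name for name in names if _key(name) == target]
```

 An empty result from similar_names("GB2312", names) would mean the list has
no close spelling at all, which is the libiconv situation described above.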
 
> > > > * In which subdirectory did you place the Chinese fonts (did AW load them at
> > > > all? - are CJK fonts available in the font selection combobox)?
> > >
> > > I think chinese fonts are ok.There is no problem now.
> >
> > From the "boot log" it seems it's OK - AW seems to find the fonts.
> >
> > > > * Does your iconv know your charset (seems so, otherwise you wouldn't see Ch
> > > > chars in GUI elements such as menus)?
> > > > * What iconv implementation do you use (the one in glibc or standalone from
> > > > libiconv?).
> > >
> > > I have searched for libiconv.so in /lib and /usr/lib, and found
> > > nothing. So I must be using glibc.
> >
> > libiconv is by default installed in /usr/local/lib.
> > To see whether AW uses libiconv, type 'ldd /PATH/NAME-OF-AW-BINARY' - it will
> > list all libraries AW is linked with. If 'libiconv' is in the list, then AW is
> > linked with it.
> >
>
> AW doesn't link to libiconv.

 OK.
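
 The ldd check quoted above is easy to script as well. A minimal Python
sketch, with a made-up function name, that takes the text printed by
'ldd /PATH/NAME-OF-AW-BINARY' and looks only at the library name at the start
of each line:

```python
def links_libiconv(ldd_output):
    """Return True if any line of ldd output names a libiconv library."""
    for line in ldd_output.splitlines():
        fields = line.split()
        # The first field of each ldd line is the library name.
        if fields and fields[0].startswith("libiconv"):
            return True
    return False
```

 A False result matches the answer given above: the binary uses the iconv
built into glibc.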
 
> > > > * Could you paste Ch chars from other apps? Could you import .txt files into
> > > > AW?
> > >
> > > No,I can't paste any chars(include english characters) from other apps.
> > > I can import .txt files into AW.
> >
> > Can you import .txt file with Ch chars into AW? In both GB2312 and Big5?
> > Can you paste anything into Big5 version?
>
> I can paste nothing into the Big5 version. It seems that there is something
> wrong with our pretty clipboard. :-)
>
> >
> > > > * Could you save Ch chars as .txt or copy them to other documents
> > >
> > > Yes.
> >
> > In both GB2312 and Big5?
> > Does it copy Ch chars _correctly_? Does pasting to other apps work?
> >
>
> Yes, both GB2312 and Big5, and it works very well.
> That is because I only need to select the characters with the mouse, with no
> need to use Ctrl+C, so AW won't crash. (I installed a daemon which lets the
> mouse pass data between different applications, so I think the fact that I
> can get characters and paste them into other applications has nothing to do
> with AW, IMHO.)
> I think our AW's clipboard has some bugs. :-(
>
> > > > * Could you print correctly in Big5?
> > >
> > > I haven't got a printer.I will test this later.
> >
> > You don't have to have a printer.
> > You can print to a .ps file (just select "file" in the print dialog) and look at
> > the raw file and/or view it in gv (ghostview - part of the GhostScript package).
> >
>
> I am using gs 5.05, which doesn't support TTF fonts. So I can see nothing
> in gv.

 Hmm, are those font files TTF ones? I thought they were Type1 ones.
 How did you compose fonts.dir for your CJK fonts? The syntax of the lfds should
 be the same as in fonts.hj, and the first 4 fonts in fonts.dir are treated
 specially as I understand (the corresponding English fonts are associated
 with them - see xap_UnixFont*.cpp for the use of s_defaultNonCJKFont and
 s_defaultCJKFont).

> :-( I will test tomorrow.

 OK. But I thought you could read the raw .ps files by eye and check whether
they are correct.
 
> >[...]
> > Thank you.
> >
> > Best regards,
> > -Vlad
> Best regards!
> -Belcon
>

 I'm sorry for my terse and maybe impolite answers - I have very little time.
Please excuse me.

 Best regards,
  -Vlad
