Re: May I take part in AbiWord CJKV develope? (fwd)


Subject: Re: May I take part in AbiWord CJKV develope? (fwd)
From: Vlad Harchev (hvv@hippo.ru)
Date: Mon Oct 23 2000 - 04:15:46 CDT


On Mon, 23 Oct 2000, Belcon wrote:

 Hi Belcon,

> Hello Vlad:
> > > I am glad to hear this. :-) But I still need to learn something else.
> >
> > Moreover, this (or the previous) week someone announced the existence of
> > patches for Chinese support in AW! (This seems to be your first target.)
> > Here is the location of the patch:
> > http://www.hj.webprovider.com/develope/index.html
> > I will forward that message with URL of screenshot to you too.
>
> Yes, I already know that place. I think this is great work.

 As I understand it, this patch only supports Big5 and not GB2312, right?
 
> > > Yes, GTK supports CJK. (As you know, some people in China, like me, are
> > > trying their best to localize AbiWord, and they are making progress.
> > > AbiWord can now use Big5 TrueType fonts very well. The only problem with
> > > GB2312 is that AbiWord doesn't show anything on screen if there is a
> > > GB2312 character. :-( I guess AbiWord is using GTK in the wrong way.)
> > > I never tried Japanese and Korean, but I found that GTK+-1.2.8 has files
> > > named "gtkrc.ko" and "gtkrc.ja". Since GB2312 and Big5 work well, I guess
> > > KSC5601 and JISX0208 work well too. :-)
> >
> > Nice to hear that. As I understand it, there are some patches for AW that
> > make it partially support CJKV?
> > As for the reason GB2312 is not shown: that is how non-patched AW behaves;
> > it is not GTK's fault :(
> >
>
> A developer from the Taiwan CLE group has made a patch that supports GB2312
> and Big5. But there is a very big problem, just as I said above. I don't know
> why AbiWord can't show GB2312 (although it can print GB2312 properly) while
> Big5 is fine after patching.
> I am working on this problem now. And I think that if this problem is solved,
> supporting CJKV in AbiWord will be simple, IMO.

 I think the only place you need to look at is
        GR_Graphics::remapGlyph in file src/af/gr/xp/gr_Graphics.cpp
  once you can successfully map a Unicode character to some wide character in
your locale. The result of the mapping should be ready for drawing by
        gdk_draw_text_wc()
in GR_UnixGraphics::drawChar() or GR_UnixGraphics::drawChars().
That's all.
  That's rather easy to achieve IMO. The problem could be caused by GTK/GDK, so
just look into the sources of gdk_draw_text_wc() (they are very easy to
understand). Chances are that Xlib is broken, and that causes your problem.
 Feel free to ask any questions if they arise while you are tweaking this.
 A quick solution could be to look at the Mozilla sources - it uses GTK on
Unix.

 Once you've solved this, AW will support CJKV nicely.
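
 To make the remapping step above concrete, here is a minimal sketch. It is
not AbiWord's actual remapGlyph: every name in it (LocaleWChar, GlyphTable,
remap_glyph, remap_run) is hypothetical, the table contents are placeholders,
and it assumes ASCII maps to itself. It only shows the shape of the idea: each
Unicode value becomes a locale wide character, and an unmapped value becomes a
visible fallback instead of disappearing.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// All names below are hypothetical; this is not AbiWord's remapGlyph.
using LocaleWChar = std::uint32_t;  // stand-in for the toolkit's wide-char type
using GlyphTable = std::unordered_map<std::uint16_t, LocaleWChar>;

// Map one UCS-2 code unit to a locale wide character.
// Assumption: ASCII maps to itself; everything else goes through the table.
LocaleWChar remap_glyph(std::uint16_t ucs2, const GlyphTable& table,
                        LocaleWChar fallback) {
    if (ucs2 < 0x80) return ucs2;
    auto it = table.find(ucs2);
    // A visible fallback keeps unmapped characters from silently vanishing.
    return it != table.end() ? it->second : fallback;
}

// Remap a whole run so it could be handed to a wide-character drawing call.
std::vector<LocaleWChar> remap_run(const std::vector<std::uint16_t>& text,
                                   const GlyphTable& table,
                                   LocaleWChar fallback) {
    std::vector<LocaleWChar> out;
    out.reserve(text.size());
    for (std::uint16_t c : text) out.push_back(remap_glyph(c, table, fallback));
    return out;
}
```

 In a real build the table would come from the locale's converter, and the
output would go to the drawing call; both are left out here on purpose.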

> > > Yes, MS Word has CJKV versions. I must admit that it supports CJKV very
> > > well, although it is painful to wait for MS Word to start up.
> > > Because I only have the GB2312 version of MS Word, I have no way to give
> > > you CJKV documents other than GB2312. I have no place to upload this
> > > document, so I have to attach it to this letter. Luckily, it is not very
> > > big.
> >
> > Yes, it's very small. Thanks for it.
> > You mentioned the GB2312 version of MSWord - are there other versions of
> > MSWord that support other CJKV encodings? If yes, can they correctly read files
>
> As far as I know, yes.
>
> > produced by other MSWords with support for different encodings?
> >
>
> No, we always use a third-party application to show Big5 MSWord documents
> while using the GB2312 version of MSWord.

 This pain is to be expected from MS products :(
 
> > As long as all supersets of CJKV can be represented in UCS-2, it's OK :)
> > Even if they don't - that's not fatal.
> >
> > > Two months ago, I made gs5.05 able to show CJKV fonts. But it needs Type1
> > > font support.
> > > So even if I send a PS document to you, you still can't read it.
> >
> > I just wanted to see what PS prologue is emitted to allow CJKV fonts in PS
> > files. So, if you can generate a small .ps file that does not contain the
> > fonts themselves, I would like to see it (not in gv, just the raw .ps).
> >
> I am not sure what you mean. But I can give you my test PS file. (Sorry, I
> was too lazy to normalize Big5, JISX and KSC the way I did GB2312.)

 OK, received, thanks. But I thought some PostScript code was required to
specify the encoding of the data... On the other hand, it's nice that no
special prologue is required :)

> > > > The biggest problem would be printing - generating a PS prologue that
> > > > will make the fonts work in GS.
> > > >
> > > Yes, we need GS support.
> > >
> > > > Also, are there X font servers that can display CJKV Type1 fonts (most
> > > > probably, yes)?
> > >
> > > Yes, there are many X font servers that can display CJKV fonts.
> >
> > Can they display Type1 fonts too?
> >
> Sure, they can!
>
> > > >
> > > > Also, I have the impression that support for Vietnamese is already
> > > > available in AW thanks to my megapatch for non-Latin-1 single-byte
> > > > characters, since all Vietnamese characters can be represented as one byte.
> > > >
> > > > Also, as I remember, CJK "words" can be wrapped at any letter (so layout
> > >
> > > Sorry for my poor English, I can't understand what you said. What do you
> > > mean here?
> >
> > For English, if there is no room at the end of the current line for some
> > word, that line is "finished" and the next line starts with that word
> > (assuming hyphenation is disabled).
> > For Japanese, as I remember, the rules allow you to put as many letters
> > of that word as fit on the current line, and the remaining letters of that
> > word start on the next line (i.e. the rules allow you to break a word at
> > any character).
> > So, for example, for monospaced fonts and English, if there are 4 cells left
> > on the current line and the next word is "abiword", those 4 cells will remain
> > empty since "abiword" is 7 characters long, and the next line will start with
> > "abiword". For Japanese in the same situation, "abiw" would be placed on the
> > current line and "ord" would go to the next line.
> >
> > So, do CJKV rules allow breaking words at any letter?
> >
> Thanks again! I think it is impossible for an application to know where it
> should wrap Chinese words (both GB2312 and Big5), because the meaning of
> Chinese words is very complex. (But one point is important: we can't break
> any Chinese character into two parts, because it is represented by two
> bytes.)

 That's obvious. Moreover, AW stores data as Unicode, so each character is
represented as a single unsigned short value.
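
 For comparison, here is a small sketch of what code working on raw GB2312
(EUC-CN) bytes would have to do to avoid cutting a character in half. The
helper name safe_prefix is hypothetical, and the sketch assumes well-formed
input in which any byte >= 0x80 starts a two-byte character. With one 16-bit
value per character, as in AW, no such scan is needed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length of the longest prefix of an EUC-CN (GB2312)
// byte string that fits in max_bytes without splitting a two-byte character.
// Assumption: input is well-formed; any byte >= 0x80 starts a two-byte char.
std::size_t safe_prefix(const std::string& euc, std::size_t max_bytes) {
    std::size_t i = 0;
    // Scan from the start: both bytes of a pair have the high bit set, so a
    // single byte in isolation does not tell us whether it is a lead byte.
    while (i < euc.size()) {
        const unsigned char b = static_cast<unsigned char>(euc[i]);
        const std::size_t step = (b >= 0x80) ? 2 : 1;
        if (i + step > max_bytes || i + step > euc.size()) break;
        i += step;
    }
    return i;
}
```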

> I am not sure about Japanese and Korean, because I am Chinese. Does anyone
> on this mailing list have an idea about that?

  I'm 99% sure that it's possible to split Japanese words at any letter,
since I remember discussing this topic with a Japanese guy on the lynx-dev
mailing list.
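
  The two wrapping behaviours from the "abiword" example can be sketched as
follows. This is only an illustration under stated assumptions: the function
name wrap_cells is hypothetical, the font is monospaced with one cell per
16-bit character, words are separated by single spaces, and no real CJK
line-breaking rules (forbidden line-start or line-end characters, etc.) are
applied.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: wrap text into lines of at most `width` cells, one
// cell per 16-bit character. With break_anywhere == false a word that does
// not fit moves whole to the next line; with true, the free cells are filled
// with the start of the word and the rest continues on the next line.
std::vector<std::u16string> wrap_cells(const std::u16string& text,
                                       std::size_t width,
                                       bool break_anywhere) {
    assert(width > 0);
    std::vector<std::u16string> lines;
    std::u16string line;
    std::size_t i = 0;
    while (i < text.size()) {
        // Take the next space-delimited word.
        std::size_t end = text.find(u' ', i);
        if (end == std::u16string::npos) end = text.size();
        std::u16string word = text.substr(i, end - i);
        i = (end < text.size()) ? end + 1 : end;
        if (word.empty()) continue;  // skip repeated spaces

        if (!line.empty()) {
            if (line.size() + 1 + word.size() <= width) {
                line += u' ';
                line += word;
                continue;
            }
            if (break_anywhere && line.size() + 1 < width) {
                // Fill the remaining cells with the head of the word.
                const std::size_t room = width - line.size() - 1;
                line += u' ';
                line += word.substr(0, room);
                word.erase(0, room);
            }
            lines.push_back(line);
            line.clear();
        }
        // A word longer than a whole line has to be cut in either mode.
        while (word.size() > width) {
            lines.push_back(word.substr(0, width));
            word.erase(0, width);
        }
        line = word;
    }
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

  With a width of 8 and the text "abc abiword", the word-based mode gives the
lines "abc" and "abiword", while the break-anywhere mode gives "abc abiw" and
"ord".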

> Cheers
>

 Best regards,
  -Vlad


