Re: May I take part in AbiWord CJKV develope? (fwd)


Subject: Re: May I take part in AbiWord CJKV develope? (fwd)
From: hj (huangj@citiz.net)
Date: Mon Oct 23 2000 - 08:36:12 CDT


----- Original Message -----
From: Vlad Harchev <hvv@hippo.ru>
To: Belcon <rainfall@yeah.net>
Cc: <abiword-dev@abisource.com>
Sent: October 23, 2000 17:15
Subject: Re: May I take part in AbiWord CJKV develope? (fwd)

> On Mon, 23 Oct 2000, Belcon wrote:
>
> Hi Belcon,
>
> > Hello Vlad:
> > > > I am glad to hear this. :-) But I still need to learn something else.
> > >
> > > Moreover, this (or the previous) week someone announced the existence of
> > > patches for support of Chinese in AW! (This seems to be your first target.)
> > > Here is the location of the patch:
> > > http://www.hj.webprovider.com/develope/index.html
> > > I will forward that message with URL of screenshot to you too.
> >
> > Yes, I already know that place. I think this is great work.
>
> As I understand it, this patch only supports Big5 and not GB2312, right?

This patch only supports GB2312. You must modify fonts.hj if you want it to
support Big5.

>
> > > > Yes, GTK supports CJK. (As you know, some people in China, like me, are
> > > > trying their best to localize AbiWord, and they are making progress.
> > > > AbiWord can now use Big5 TrueType fonts very well. The only problem with
> > > > GB2312 is that AbiWord doesn't show anything on screen if there is a
> > > > GB2312 character. :-( I guess that something is wrong with the way GTK
> > > > is being used.) I have never tried Japanese or Korean, but I found that
> > > > GTK+-1.2.8 has files named "gtkrc.ko" and "gtkrc.ja". Since GB2312 and
> > > > Big5 work well, I guess KSC5601 and JISX0208 work well too. :-)
> > >
> > > Nice to hear that. As I understand it, there exist some patches to AW
> > > that make it partially support CJKV?
> > > As for the reason GB2312 is not shown: that is how non-patched AW works,
> > > not GTK :(
> > >
> >
> > A developer from the Taiwan CLE group has made a patch that supports GB2312
> > and Big5. But there is a very big problem, as I said above: I don't know
> > why AbiWord can't show GB2312 (although it can print GB2312 properly) while
> > Big5 is fine after patching.
> > I am working on this problem now. And I think that once this problem is
> > solved, making AbiWord support CJKV will be simple, IMO.
>
> I think the only place you should look is
> GR_Graphics::remapGlyph in the file src/af/gr/xp/gr_Graphics.cpp,
> once you can successfully map a Unicode character to some wide character in
> your locale. The result of the mapping should be ready for drawing by
> gdk_draw_text_wc()
> in GR_UnixGraphics::drawChar() or GR_UnixGraphics::drawChars().
> That's all.
> That's rather easy to achieve, IMO. The problem may be caused by GTK/GDK, so
> just look into the sources of gdk_draw_text_wc() (they are very easy to
> understand). Chances are that Xlib is broken, and that is causing your problem.
> Feel free to ask any questions if they arise while you are tweaking this.
> A quick solution could be to look at the Mozilla sources, since it uses GTK
> on Unixes.
>
> Once you've solved this, AW will support CJKV nicely.

Does gdk_draw_text_wc() support UCS-2 or UTF-8? I can't display GB2312 with
gdk_draw_text_wc(), so I convert the Unicode text to a multibyte string (mbs)
and display GB2312 with gdk_draw_text().
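
For illustration, the Unicode-to-multibyte step described above might look
like the following minimal Python sketch. It is not the code from the patch
and does not call GDK; the helper name unicode_to_gb2312 is hypothetical, and
it relies only on Python's built-in "gb2312" codec.

```python
# Illustrative sketch only: turn Unicode text into a GB2312 multibyte
# string, the kind of byte string a locale-encoded drawing call expects.
# The function name is hypothetical and not part of AbiWord or GDK.

def unicode_to_gb2312(text: str) -> bytes:
    """Encode text as GB2312; characters GB2312 lacks become b'?'."""
    return text.encode("gb2312", errors="replace")


sample = "AbiWord \u4e2d\u6587"      # "AbiWord" followed by two Chinese characters
mbs = unicode_to_gb2312(sample)

# ASCII stays one byte per character; each Chinese character takes two bytes.
ascii_part, cjk_part = mbs[:8], mbs[8:]

# In UCS-2 the same two characters are one 16-bit code unit each.
code_units = [ord(ch) for ch in sample[8:]]
```

Here the eight ASCII characters stay one byte each, while each of the two
Chinese characters becomes two bytes in GB2312 but a single 16-bit value in
UCS-2.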

>
> > > > Yes, MS Word has CJKV versions. I must admit that it supports CJKV very
> > > > well, although it is painful to wait for MS Word to start up.
> > > > Because I only have the GB2312 version of MS Word, I have no way to give
> > > > you CJKV documents other than GB2312. I have no place to upload this
> > > > document, so I have to attach it to this letter. Luckily, it is not very
> > > > big.
> > >
> > > Yes, it's very small. Thanks for it.
> > > You mentioned the GB2312 version of MSWord: are there other versions of
> > > MSWord that support other CJKV encodings?
> > > If yes, can they correctly read files
> >
> > As far as I know, yes.
> >
> > > produced by other MSWords with support for different encodings?
> > >
> >
> > No, we always use a third-party application to show Big5 MSWord documents
> > while using the GB2312 version of MSWord.
>
> This pain is to be expected from MS products :(
>
> > > As long as all supersets of CJKV can be represented as UCS-2, it's OK :)
> > > Even if they can't, that's not fatal.
> > >
> > > > Two months ago, I made gs5.05 able to show CJKV fonts. But it needs
> > > > Type1 font support.
> > > > So even if I send a PS document to you, you still can't read it.
> > >
> > > I just wanted to look at what PS prolog is emitted to allow CJKV fonts in
> > > PS files. So, if you can generate a small .ps file that does not contain
> > > the fonts themselves, I would be willing to see it (not in gv, just the
> > > raw .ps).
> > >
> > I am not sure what you mean. But I can give you my test PS file. (Sorry,
> > I was too lazy to normalize Big5, JISX, and KSC the way I did GB2312.)
>
> OK, received, thanks. But I thought some postscript code was required to
> specify the encoding of the data... On the other hand, it's nice that no
> special prologue is required :)
>
> > > > > The biggest problem would be printing: generating the PS prologue
> > > > > that will make the fonts work in GS.
> > > > >
> > > > Yes, we need GS support.
> > > >
> > > > > Also, are there X font servers that can display CJKV Type1 fonts
> > > > > (most probably, yes)?
> > > >
> > > > Yes, there are many X font servers that can display CJKV fonts.
> > >
> > > Can they display Type1 fonts too?
> > >
> > Sure, they can!
> >
> > > > >
> > > > > Also, I have the impression that support for Vietnamese is already
> > > > > available in AW thanks to my megapatch for non-Latin-1 single-byte
> > > > > characters, since all Vietnamese characters can be represented as one
> > > > > byte.
> > > > >
> > > > > Also, as I remember, CJK "words" can be wrapped at any letter (so layout
layout
> > > >
> > > > Sorry for my poor English; I can't understand what you said. What do
> > > > you mean here?
> > >
> > > In English, if there is not enough space at the end of the current line
> > > for some word, that line is "finished" and the next line starts with that
> > > word (provided, of course, that hyphenation is disabled).
> > > For Japanese, as I remember, the rules allow you to put as many letters
> > > of that word as fit on the current line, and the remaining letters of
> > > that word start on the next line (i.e. the rules allow you to break a
> > > word at any character).
> > > So, for example, with a monospaced font and English, if there are 4 cells
> > > left on the current line and the next word is "abiword", those 4 cells
> > > will remain empty since "abiword" is 7 characters long, and the next line
> > > will start with "abiword". For Japanese in the same situation, "abiw"
> > > would be placed on the current line and "ord" would go to the next line.
> > >
> > > So, do CKV rules allow breaking words at any letter?
> > >
> > Thanks again! I think it is impossible for an application to know where
> > it should wrap a Chinese word (in either GB2312 or Big5), because the
> > meaning of Chinese words is very complex. (But one point is important: we
> > can't break any Chinese character into two parts, because it is
> > represented by two bytes.)
>
> That's obvious. Moreover, AW stores data as Unicode, so each character is
> represented as a single unsigned short value.
>
> > I am not sure about Japanese and Korean, because I am Chinese. Does anyone
> > on this mailing list have an idea about that?
>
> I think I'm 99% sure that it's possible to split Japanese words at any
> letter, since I remember discussions on this topic with one Japanese guy on
> the lynx-dev mailing list.
>
> > Cheers
> >
>
> Best regards,
> -Vlad
>
>
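
The two wrapping rules discussed in this message (break only at spaces for
English, break after any character for Japanese) can be sketched roughly as
below. This is a simplified Python illustration for monospaced cells, not
AbiWord's layout code; the function names wrap_words and wrap_any_char are
hypothetical, and real Japanese typesetting also has rules about characters
that may not start or end a line, which are ignored here.

```python
# Simplified sketch of the two line-breaking rules described in the thread.
# Both functions assume a monospaced font: one character fills one cell.

def wrap_words(text: str, width: int) -> list[str]:
    """English-style wrapping: break only at spaces, never inside a word."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate          # word fits (or the line is still empty)
        else:
            lines.append(current)        # finish the line, leaving cells unused
            current = word               # the whole word moves to the next line
    if current:
        lines.append(current)
    return lines


def wrap_any_char(text: str, width: int) -> list[str]:
    """Break after any character: fill every line before starting the next."""
    return [text[i:i + width] for i in range(0, len(text), width)]


# A 10-cell line that already holds "hello " has 4 cells left for "abiword".
english = wrap_words("hello abiword", 10)
any_char = wrap_any_char("hello abiword", 10)
```

With a 10-cell line, the first rule leaves the last 4 cells empty and moves
"abiword" to the next line, while the second places "abiw" on the current
line and "ord" on the next, matching the example in the message.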



This archive was generated by hypermail 2b25 : Mon Oct 23 2000 - 08:38:54 CDT