Re: Patch: Fix for Bug 1164, 2nd try


From: Vlad Harchev (hvv@hippo.ru)
Date: Tue May 22 2001 - 12:07:30 CDT


On Tue, 22 May 2001, Andrew Dunbar wrote:

> Vlad Harchev wrote:
> >
> > > > Andrew's RTF looks different from yours because Andrew is running Win2k - it
> > > > seems XAP_EncodingManager.is_cjk_locale() returns 0 under Win2k and 1 (as
> > > > expected) under unix. Andrew, could you correct this (make is_cjk_locale()
> > > > return the proper value)?
> > >
> > > If the \uxxxx output is correct (that other wp can load it with no
> >
> > WordPad from Win95 won't load it (the \uxxxx form). The one from Win98 may
> > not either. There may be other primitive editors that won't load it.
> > So I think it would be safer not to use the \uxxxx form, for maximum
> > compatibility. So a correctly working is_cjk_locale() seems to be needed.
> >
> > > problem), maybe is_cjk_locale is not needed? Sorry, I forget the
> > > reason for having it in the first place. We must have some reason...
> > > Maybe for export to the 'rtf for old application' format?
> >
> > is_cjk_locale is widely used internally on unix - for printing, drawing etc.
> > As for 'rtf for old apps' - its output doesn't differ from the usual RTF
> > exporter's for multibyte locales, only for single-byte ones (it exists for
> > StarOffice 5.2, which doesn't parse RTF correctly).
>
> It seems to me that we are relying on is_cjk_locale() for a little
> too much. Only in so far as, if we are running on a Chinese locale
> and editing a Japanese document, we will save characters that are
> compatible between both correctly but others will probably save as

 Sorry, what do you mean by "others will probably save as"?

> '?' - and many may not expect this. A more advanced exporter would
> save all characters supportable by the encoding as 8 bit multibyte
> and all non-supported characters as unicode. I believe this is what
> the later versions of MSWord and WordPad do.

 Our exporter does the same for non-CJK locales (as it seems that CJK locales
include a lot of other single-byte ones) - it tries to save each character in
the single-byte encoding and, if the character can't be represented, saves it
as Unicode.
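
 To make that fallback concrete, here is a rough sketch in Python. It is not
the AbiWord exporter code; the function name rtf_escape_text and the default
code page are made up for illustration. It assumes the usual RTF conventions:
\'hh for a byte in the document's code page, and \uN followed by a '?'
placeholder for a Unicode code unit, with N written as a signed 16-bit decimal.

```python
def rtf_escape_text(text, encoding="cp1252"):
    """Hypothetical helper: escape text for RTF, preferring the 8-bit form.

    Characters representable in `encoding` are written as \\'hh bytes;
    anything else falls back to the \\uN? Unicode form.
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            # RTF control characters must be backslash-escaped.
            out.append("\\" + ch)
        elif ord(ch) < 0x80:
            # Plain ASCII is written as-is.
            out.append(ch)
        else:
            try:
                raw = ch.encode(encoding)
            except UnicodeEncodeError:
                # Not representable in the code page: emit UTF-16 code
                # units as signed 16-bit decimals, each with a '?' fallback.
                units = ch.encode("utf-16-be")
                for i in range(0, len(units), 2):
                    code = int.from_bytes(units[i:i + 2], "big")
                    if code > 32767:
                        code -= 65536
                    out.append("\\u%d?" % code)
            else:
                # Representable: one \'hh escape per encoded byte, so
                # multibyte encodings work the same way.
                out.append("".join("\\'%02x" % b for b in raw))
    return "".join(out)


if __name__ == "__main__":
    print(rtf_escape_text("caf\u00e9 \u4e2d", "cp1252"))
    print(rtf_escape_text("\u4e2d", "gb2312"))
```

 With a multibyte encoding such as gb2312 the same loop writes a CJK character
as two \'hh bytes, which is the form old readers such as Win95 WordPad are
more likely to accept than \uN.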
 
> I just want to be sure we are aware of this - it's probably a bit
> much to do right now and the is_cjk_locale() is the next best
> thing in the meantime.

 All cases of its use may need to be audited - but I doubt that they are too
incorrect.

 Best regards,
  -Vlad


