Re: Patch: Fix for Bug 1164, 2nd try


Subject: Re: Patch: Fix for Bug 1164, 2nd try
From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Mon May 21 2001 - 11:11:39 CDT


Vlad Harchev wrote:
>
> On Mon, 21 May 2001, Andrew Dunbar wrote:
>
> > Here's my second try.
> > I've added more cpg's and fcharset's after doing some tests with
> > MS Word and WordPad.
> > I've made all the encoding names "CPxxx".
> > Bugfix 836 is not broken any longer. Note that not even MS Word
> > or Wordpad can load 836.rtf but we can (:
> >
> > I hope that's everything. CJK multibyte locales are not
> > imported correctly yet.
>
> Why? CJK (Chinese) users told us that RTF was being imported and exported just
> fine.

I'm not sure. It may only be when the CJK text is governed by the
font \fcharset tag or maybe something has broken. My hunch is that
each byte of the multibyte characters is going through iconv separately
but I'm not sure. We might want to look at what happens around
line 834 at the call to m_mbtowc.mbtowc().
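
To illustrate the hunch above (it is unconfirmed), here is a minimal sketch of grouping raw bytes into whole characters before conversion, so that a two-byte character is never handed to the converter one byte at a time. It assumes CP932 (Shift-JIS), whose lead bytes are 0x81-0x9F and 0xE0-0xFC. The names isCp932LeadByte and splitCp932 are hypothetical and are not part of the AbiWord sources.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption: CP932 (Shift-JIS) lead bytes lie in 0x81-0x9F and 0xE0-0xFC.
static bool isCp932LeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: split raw CP932 bytes into one string per character.
// A lead byte is kept together with the byte that follows it; a lead byte
// at the very end of the input is emitted alone.
std::vector<std::string> splitCp932(const std::string &bytes)
{
    std::vector<std::string> units;
    for (std::size_t i = 0; i < bytes.size(); ) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::size_t len = (isCp932LeadByte(b) && i + 1 < bytes.size()) ? 2 : 1;
        units.push_back(bytes.substr(i, len));
        i += len;
    }
    return units;
}
```

Each element of the returned vector could then be passed to the converter as a unit. If the importer instead converts every byte as it arrives, the lead byte of a two-byte character is converted alone and cannot be decoded correctly.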

> As for the patch - let me suggest a few more case statements for the switch
> (charset -> iconv's charset name). This is an excerpt from the OpenOffice
> sources - please adapt them appropriately for AW:
>
> case 77: eTextEncoding = RTL_TEXTENCODING_APPLE_ROMAN;break;
> case 130: eTextEncoding = RTL_TEXTENCODING_MS_1361; break;
> case 255: eTextEncoding = RTL_TEXTENCODING_IBM_850; break;

Excellent! Thanks. Can you see if they handle case 2 for the
symbol charset?

Attached is a patch with these new cases.

Andrew.

-- 
http://linguaphile.sourceforge.net

Index: src/wp/impexp//xp/ie_imp_RTF.cpp
===================================================================
RCS file: /cvsroot/abi/src/wp/impexp/xp/ie_imp_RTF.cpp,v
retrieving revision 1.63
diff -u -r1.63 ie_imp_RTF.cpp
--- src/wp/impexp//xp/ie_imp_RTF.cpp 2001/05/21 14:30:08 1.63
+++ src/wp/impexp//xp/ie_imp_RTF.cpp 2001/05/21 16:00:21
@@ -236,12 +236,18 @@
             UT_DEBUGMSG(("RTF Font charset 'Symbol' not implemented\n"));
             UT_ASSERT(UT_NOT_IMPLEMENTED);
             break;
+        case 77:    // Mac Roman - as in OpenOffice
+            m_szEncoding = "MACINTOSH";
+            break;
         case 128:   // SHIFTJIS_CHARSET
             m_szEncoding = "CP932";
             break;
         case 129:   // Hangul - undocumented?
             m_szEncoding = "CP949";
             break;
+        case 130:   // Johab - as in OpenOffice
+            m_szEncoding = "CP1361";
+            break;
         case 134:   // Chinese GB - undocumented?
             m_szEncoding = "CP936";
             break;
@@ -295,10 +301,8 @@
             UT_DEBUGMSG(("RTF Font charset 'PC437'??\n"));
             UT_ASSERT(UT_NOT_IMPLEMENTED);
             break;
-        case 255:   // OEM_CHARSET
-            // TODO Can iconv do this?
-            UT_DEBUGMSG(("RTF Font charset 'OEM'??\n"));
-            UT_ASSERT(UT_NOT_IMPLEMENTED);
+        case 255:   // OEM_CHARSET - IBM 850 as in OpenOffice
+            m_szEncoding = "CP850";
             break;
         default:
             UT_DEBUGMSG(("RTF Font charset unknown: %d\n", m_charSet));



