Re: Patch: Fix for Bug 1164, 2nd try


Subject: Re: Patch: Fix for Bug 1164, 2nd try
From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Tue May 22 2001 - 05:35:32 CDT


Vlad Harchev wrote:
>
> On Tue, 22 May 2001, Hubert Figuiere wrote:
>
> > According to Vlad Harchev <hvv@hippo.ru>:
> > > On Tue, 22 May 2001, ha shao wrote:
> > >
> > > > On Tue, May 22, 2001 at 03:26:27AM +1000, hippietrail@yahoo.com wrote:
> > > > > Vlad Harchev wrote:
> > > > >
> > > > > If I create an RTF with CJK characters in Word it looks like this:
> > > > >
> > > > > \f5\'82\'c9\'82\'d9\'82\'f1\'82\'b2
> > > > >
> > > > > This is what is not being imported correctly.
> > > > >
> > > > > > And no, each byte of a multibyte sequence is not going through iconv separately; our code is:
> > > > > > if (m_mbtowc.mbtowc(wc,(UT_Byte)ch))
> > > > > > return AddChar(wc);
> > > > > > It internally appends 'ch' to the array-of-chars member of m_mbtowc, then calls
> > > > > > iconv and checks whether it was able to convert the aggregated sequence. If it
> > > > > > was, the wchar is returned; otherwise 0 is returned (the already aggregated
> > > > > > sequence isn't lost between calls).
> > > >
> > > > The problem is with the 'else' clause of the 'if'. When
> > > > m_mbtowc.mbtowc() returns 0, the 'else' resets m_mbtowc,
> > > > so we lose the internal buffer. Commenting it out will bring
> > > > the CJK import back to its old shape.
> > >
> > > The code in AW 0.7.14 is
> > > if (no_convert==0 && ch<=0xff)
> > > {
> > > wchar_t wc;
> > > if (m_mbtowc.mbtowc(wc,(UT_Byte)ch))
> > > return AddChar(wc);
> > else
> > m_mbtowc.initialize();
> > > } else
> > > return AddChar(ch);
> > >
> > > > I don't see how the 'else' can reset the buffer (AddChar(ch) doesn't reset it
> > > > either). Or has the RTF importer changed there since 0.7.14?
> >
> > It has changed. See above.
> > The committer is dom, on May 3 (revision 1.58). Dom, can you explain what it is for?
>
> Thank you for the research. That added chunk:
> else
> m_mbtowc.initialize();
> should definitely be removed.

I'm not an iconv expert but I thought we needed something like
that to reset the internal state after trying and failing to
convert a character. Something *like* that - but not exactly that...
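
One way to get "something like that" might be to look at errno after
iconv() fails. POSIX iconv reports EINVAL for an incomplete multibyte
sequence and EILSEQ for an invalid one, so the buffer could be kept in the
first case and reset only in the second. Below is a minimal sketch of that
idea, not the actual AbiWord code: ByteAccumulator is a hypothetical
stand-in for m_mbtowc, it converts to UTF-8 instead of wchar_t to keep the
example short, and treating the sample bytes from Word as Shift_JIS is an
assumption (the thread does not name the code page).

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the importer's m_mbtowc helper.
class ByteAccumulator {
public:
    explicit ByteAccumulator(const char* fromCharset)
        : m_cd(iconv_open("UTF-8", fromCharset)) {
        if (m_cd == (iconv_t)-1)
            throw std::runtime_error("iconv_open failed");
    }
    ~ByteAccumulator() { iconv_close(m_cd); }
    ByteAccumulator(const ByteAccumulator&) = delete;
    ByteAccumulator& operator=(const ByteAccumulator&) = delete;

    // Feed one input byte. Returns true and stores the converted character
    // (as UTF-8) in 'out' once the pending bytes form a complete character.
    bool feed(unsigned char byte, std::string& out) {
        m_pending.push_back(static_cast<char>(byte));

        char outbuf[16];
        char* in = &m_pending[0];
        size_t inLeft = m_pending.size();
        char* outp = outbuf;
        size_t outLeft = sizeof outbuf;

        if (iconv(m_cd, &in, &inLeft, &outp, &outLeft) != (size_t)-1) {
            out.assign(outbuf, outp - outbuf);
            m_pending.clear();
            return true;
        }
        if (errno == EINVAL)
            return false;  // incomplete sequence: keep the bytes, wait for more

        // EILSEQ (or anything else): the pending bytes can never become a
        // character, so drop them and reset the conversion state.
        m_pending.clear();
        iconv(m_cd, nullptr, nullptr, nullptr, nullptr);
        return false;
    }

    size_t pending() const { return m_pending.size(); }

private:
    iconv_t m_cd;
    std::string m_pending;
};

// Usage sketch with the first two bytes of the sample, \'82\'c9.
inline void demo() {
    ByteAccumulator acc("SHIFT_JIS");
    std::string utf8;
    bool done = acc.feed(0x82, utf8);  // lead byte only
    assert(!done && acc.pending() == 1);
    done = acc.feed(0xc9, utf8);       // trail byte completes the character
    assert(done && acc.pending() == 0);
}
```

With this shape the lead byte \'82 survives the first failed call, which
is what the unconditional initialize() in the 'else' branch prevents.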

Andrew.

-- 
http://linguaphile.sourceforge.net





This archive was generated by hypermail 2b25 : Sat May 26 2001 - 03:51:06 CDT