Re: Patch: Fix for Bug 1164, 2nd try


Subject: Re: Patch: Fix for Bug 1164, 2nd try
From: Vlad Harchev (hvv@hippo.ru)
Date: Tue May 22 2001 - 02:52:30 CDT


On Tue, 22 May 2001, ha shao wrote:

> On Tue, May 22, 2001 at 03:26:27AM +1000, hippietrail@yahoo.com wrote:
> > Vlad Harchev wrote:
> >
> > If I create an RTF with CJK characters in Word it looks like this:
> >
> > \f5\'82\'c9\'82\'d9\'82\'f1\'82\'b2
> >
> > This is what is not being imported correctly.
> >
> > > And no, each byte of multibyte is not going through iconv, our code is:
> > > if (m_mbtowc.mbtowc(wc,(UT_Byte)ch))
> > > return AddChar(wc);
> > > It internally appends 'ch' to the array-of-chars member of m_mbtowc, then calls
> > > iconv and checks whether it was able to convert the aggregated sequence. If it
> > > was, the wchar is returned; otherwise 0 is returned (the already aggregated
> > > sequence isn't lost between calls).
>
> The problem is with the 'else' clause of the 'if'. When
> m_mbtowc.mbtowc() returns 0, the 'else' resets m_mbtowc.
> Now we have lost the internal buffer. Commenting it out will bring
> the CJK import back to its old shape.

 The code in AW 0.7.14 is:

        if (no_convert==0 && ch<=0xff)
        {
                wchar_t wc;
                if (m_mbtowc.mbtowc(wc,(UT_Byte)ch))
                        return AddChar(wc);
        } else
                return AddChar(ch);

 I don't see how the 'else' can reset the buffer (AddChar(ch) doesn't reset it
either). Or has the RTF importer changed there since 0.7.14?
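
 To make the disagreement concrete, here is a minimal sketch of what a reset
on an incomplete sequence would do to the Word sample quoted earlier in this
thread. Everything in it is an assumption for illustration: ToyMbtowc,
hexEscapes and decode are hypothetical names, and the toy converter does not
call iconv. It only knows that a Shift-JIS lead byte needs one trail byte, and
it returns the raw two-byte code rather than a Unicode value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for m_mbtowc: buffers bytes until a character is
// complete. The real class is said to call iconv; this one does not.
class ToyMbtowc {
public:
    // Returns true and sets wc once the buffered bytes form a character.
    // Returns false and keeps the buffer when more bytes are needed.
    bool mbtowc(std::uint32_t &wc, unsigned char ch) {
        buf_.push_back(ch);
        assert(buf_.size() <= 2);       // toy encoding: at most two bytes
        if (buf_.size() == 1 && isLeadByte(ch))
            return false;               // incomplete: keep the lead byte
        wc = 0;
        for (unsigned char b : buf_)
            wc = (wc << 8) | b;         // raw code, not converted to Unicode
        buf_.clear();
        return true;
    }
    void reset() { buf_.clear(); }

private:
    static bool isLeadByte(unsigned char ch) {
        return (ch >= 0x81 && ch <= 0x9F) || (ch >= 0xE0 && ch <= 0xFC);
    }
    std::vector<unsigned char> buf_;
};

// Extracts the bytes written as \'hh escapes; all other text is skipped.
std::vector<unsigned char> hexEscapes(const std::string &rtf) {
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i + 3 < rtf.size(); ++i) {
        if (rtf[i] == '\\' && rtf[i + 1] == '\'') {
            int value = std::stoi(rtf.substr(i + 2, 2), nullptr, 16);
            out.push_back(static_cast<unsigned char>(value));
            i += 3;                     // skip the rest of this escape
        }
    }
    return out;
}

// Feeds the bytes one at a time, as the importer does. With
// resetOnIncomplete set, it mimics the suspected 'else' that resets the
// converter whenever mbtowc() reports an incomplete sequence.
std::vector<std::uint32_t> decode(const std::vector<unsigned char> &bytes,
                                  bool resetOnIncomplete) {
    ToyMbtowc conv;
    std::vector<std::uint32_t> chars;
    for (unsigned char b : bytes) {
        std::uint32_t wc = 0;
        if (conv.mbtowc(wc, b))
            chars.push_back(wc);
        else if (resetOnIncomplete)
            conv.reset();               // the buffered lead byte is dropped
    }
    return chars;
}
```

 Feeding the eight bytes of \'82\'c9\'82\'d9\'82\'f1\'82\'b2 through decode()
without the reset gives four two-byte characters. With the reset, every lead
byte is thrown away and the trail bytes come out as unrelated single-byte
characters, which would match the kind of breakage being reported.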

> Besides, since AddChar(ch) does not actually insert the
> character into the document (as suggested in its definition),
> can we just add the multibyte bytes to a buffer and call
> m_mbtowc.mbtowc() at the beginning of FlushStoredChars?

 I think it would be troublesome. Are there any reasons why the approach you
propose would be better than the current implementation (besides performance)?

 Best regards,
  -Vlad


