Re: Patch: Multi-encoding Text import/export


Subject: Re: Patch: Multi-encoding Text import/export
From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Sun May 20 2001 - 12:01:13 CDT


Vlad Harchev wrote:
>
> On Sun, 20 May 2001, Andrew Dunbar wrote:
>
> > Vlad Harchev wrote:
> > >
> > > On Sun, 20 May 2001, Andrew Dunbar wrote:
> > >
> > > > Sam TH wrote:
> > > > > Other than that, this is excellent.
> > > >
> > > > Thanks! I've found that we must make the Text Encoding a per-
> > > > document feature instead of based entirely on the locale.
> > > > I need to know how to add an "encoding" field to AbiWord's
> > > > document class - this will also be very useful for at least
> > > > the HTML and RTF importers and exporters - probably more.
> > >
> > > I think they are needed. Both RTF and HTML formats pretty precisely specify
> > > encoding (RTF - in some backward way) - so it's not necessary. The only use is
> > > if someone exported file by (or wants to export for importing into) some
> > > widely spread non-following specs app. I don't know ones that satisfy both
> > > conditions :)
> >
> > Sorry Vlad. I don't understand if you're saying adding this is
> > a good or a bad idea. I think it's an essential idea so we can
> > load an HTML document in Shift-JIS encoding and save it as a plain
> > text file in EUC-JP encoding on a machine with an English locale.
> >
> > Just the kind of thing I use MS Word for now...
>
> Why user might want to save HTMLs in some particular encoding? HTMLs can be
> put on the web in any encoding (if it's mentioned in the header) - and any
> compliant and reasonable browser will be able to show them regardless of
> encoding. The only case is - the user is web developer and needs to have
> HTML in some particular encoding (for hand-editing) - but such people should
> also have tools for converting texts between various encodings.
> So, church secretary shouldn't bother about knowing what encoding is. Just
> save in utf8 most of the time (or allow to select text encoding to write
> HTML in using abiword preferences file - without any GUI to save
> programmer's efforts) (and probably better write generic portable
> utility to convert text between arbitrary encodings, that won't relate to
> AbiWord project).

HTML is only one example. It may be company policy that a certain
encoding is used. Handheld devices with embedded browsers may only
support certain encodings. RTF also has the concept of a native
encoding. If I load a document in one filetype and it's in a certain
encoding, it's reasonable to expect that saving in a different filetype
will preserve the encoding. And what about plain text? Users might
like to load a Japanese web page and save it as plain text for use
with legacy software that doesn't yet support UTF-8.
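
To make the Shift-JIS to EUC-JP case concrete, here is a rough sketch
in Python (not AbiWord code). The function name transcode is made up
for illustration, and the only assumption is that the source encoding
is already known, e.g. from the HTML header.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Decode bytes from one encoding and re-encode them in another.

    The locale is never consulted: both encodings are passed explicitly.
    """
    text = data.decode(source_encoding)   # bytes -> Unicode text
    return text.encode(target_encoding)   # Unicode text -> bytes


# Japanese text as it might arrive from a Shift-JIS web page.
sjis_bytes = "日本語".encode("shift_jis")

# The same text written out for legacy software that expects EUC-JP.
eucjp_bytes = transcode(sjis_bytes, "shift_jis", "euc_jp")
```

The point is only that the conversion depends on the document's own
encoding and the one chosen at save time, not on the machine's locale.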

I'm not a church secretary, I suppose, but I do handle a lot of text
in many encodings and many formats. I'm a Microsoft hater, yet I find
MS Word excellent for this job - I want a free alternative.

Anyway I'm willing to implement this feature - I just need some
pointers on how/where to add a new field to the internal document
class.
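
Roughly what I have in mind, sketched in Python rather than AbiWord's
C++. Document, import_bytes and export_bytes are hypothetical names
for illustration only, not existing AbiWord classes or functions, and
the UTF-8 default is my assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Hypothetical stand-in for the internal document class."""
    text: str
    encoding: str = "utf-8"  # per-document encoding, independent of the locale


def import_bytes(data: bytes, encoding: str) -> Document:
    """Importer: decode the input and remember which encoding it used."""
    return Document(text=data.decode(encoding), encoding=encoding)


def export_bytes(doc: Document, encoding: Optional[str] = None) -> bytes:
    """Exporter: keep the document's encoding unless the user picks another."""
    return doc.text.encode(encoding or doc.encoding)


# Load Shift-JIS input, then save it unchanged and as EUC-JP.
doc = import_bytes("日本語".encode("shift_jis"), "shift_jis")
same_encoding = export_bytes(doc)
as_eucjp = export_bytes(doc, "euc_jp")
```

Each importer would set the field, and each exporter would default to
it while still letting the user override it when saving.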

Andrew Dunbar.

-- 
http://linguaphile.sourceforge.net





This archive was generated by hypermail 2b25 : Sat May 26 2001 - 03:51:05 CDT