RE: dogfood feedback -- Smart Quotes


Subject: RE: dogfood feedback -- Smart Quotes
From: Paul Rohr (paul@abisource.com)
Date: Mon Mar 05 2001 - 15:15:15 CST


At 03:46 PM 3/3/01 -0000, Tomas Frydrych wrote:
>Vlad wrote:
>> I think that the best solution is to save in utf8 or as numeric
>> entities under latin1 and CJK locales (I don't know what to prefer
>> here), and save in native encoding on other locales.
>
>I agree with Vlad, this is not only a question of being able to edit
>files in a plain text editor on non-latin1 locales, or of their size, but
>also of being able to use utilities such as grep on these files. I
>personally would prefer utf8 to the entities, because of the resulting
>file size with the entities.

Does anyone else have a preference on this?

For CJK, entitizing certainly seems ridiculous, but I'm no CJK expert.

As for Latin-1, I suppose the closest analogue to the "native encoding"
precedent would be to revert to Jeff's "entitize non-ASCII" solution, rather
than go to full utf8. I know that I personally have gotten totally used to
the occasional entity here & there (in our file format as well as HTML and
others), so the bloat's not a factor for me. Are other text-munging tools
for Latin-1 more likely to cope well with utf8 or entitized text? (Specific
examples would help here.)
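
To make the size and grep trade-off concrete, here is a rough sketch in
Python. The sample string is made up for illustration, and nothing here is
taken from the actual file format code; it only compares UTF-8 output with
"entitize non-ASCII" output (decimal numeric entities) for the same text.

```python
# Hypothetical sample: smart quotes around a short Latin-1 phrase.
sample = "\u201cna\u00efve caf\u00e9\u201d"

# Option 1: save as UTF-8.
as_utf8 = sample.encode("utf-8")

# Option 2: keep the file pure ASCII and replace every non-ASCII
# character with a decimal numeric entity. Python's built-in
# "xmlcharrefreplace" error handler does exactly this substitution.
as_entities = sample.encode("ascii", "xmlcharrefreplace")

print(len(as_utf8), as_utf8)
print(len(as_entities), as_entities)

# A byte-oriented search tool (grep-like) looking for the UTF-8 form of
# the word only finds it in the UTF-8 output; in the entitized output
# the user would have to search for the entity spelling instead.
needle = "caf\u00e9".encode("utf-8")
found_in_utf8 = needle in as_utf8
found_in_entities = needle in as_entities
print(found_in_utf8, found_in_entities)
```

For this sample the entitized form comes out nearly twice as long as the
UTF-8 form, and the plain search only matches the UTF-8 bytes. How much that
matters in practice depends on how much non-ASCII text a document holds and
on whether the tool doing the searching understands UTF-8 at all.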

I guess my temptation would be to entitize Latin-1, but I don't have such a
strong preference that I want to block consensus.

Paul


