Re: smart quote algorithm


Subject: Re: smart quote algorithm
From: Karl Ove Hufthammer (huftis@bigfoot.com)
Date: Mon Aug 14 2000 - 12:34:32 CDT


----- Original Message -----
From: "WJCarpenter" <bill-abisource@carpenter.ORG>
To: "AbiWord Mailing List" <abiword-dev@abisource.com>
Sent: Friday, July 21, 2000 7:01 AM
Subject: Re: smart quote algorithm

| koh> We can probably use all characters defined as 'Punctuation' in
| koh> the Unicode standard. These are marked as 'Po', e.g.:
|
| koh> 0021;EXCLAMATION MARK;Po;0;ON;;;;;N;;;;;
|
| Too bad there is no UT_UCS_ispunct() and friends that we can
| transplant into Abi. The idea of using the character description
| files from <http://www.unicode.org> has a pretty steep overhead,
| though that is really the only way to get it exactly right.
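
As a rough illustration of the lookup discussed above, here is a minimal sketch in Python (not AbiWord code). It parses one record in the UnicodeData.txt format quoted above and checks the general category field. Note that 'Po' is only one of several punctuation categories ('Ps', 'Pe', 'Pi', 'Pf', 'Pd' and 'Pc' also start with 'P'), so the sketch tests the first letter. The names parse_record, is_punct_from_record and is_punct are hypothetical; is_punct only stands in for the wished-for UT_UCS_ispunct() and relies on Python's bundled unicodedata module rather than on the data files themselves.

```python
import unicodedata

# Sample record in the UnicodeData.txt format, copied from the message above.
SAMPLE_RECORD = "0021;EXCLAMATION MARK;Po;0;ON;;;;;N;;;;;"


def parse_record(line):
    """Split one UnicodeData.txt record into (code point, name, general category)."""
    fields = line.strip().split(";")
    return int(fields[0], 16), fields[1], fields[2]


def is_punct_from_record(line):
    """True if the record's general category is any punctuation class (P*)."""
    _, _, category = parse_record(line)
    return category.startswith("P")


def is_punct(ch):
    """Hypothetical stand-in for UT_UCS_ispunct(), using Python's built-in tables."""
    return unicodedata.category(ch).startswith("P")


if __name__ == "__main__":
    print(parse_record(SAMPLE_RECORD))
    print(is_punct_from_record(SAMPLE_RECORD))
    print(is_punct("!"), is_punct("a"))
```

Parsing the full data file at startup is where the overhead mentioned above comes from; a precomputed table of just the punctuation ranges would be one way to avoid it, at the cost of regenerating the table when the data changes.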

Would <URL: http://ustring.charabia.net/ > be of use?

"What can a Unicode library do for me?

Unicode stores characters in 16 (or 32) bits, which means it can handle
European, Chinese, Hebrew, etc. characters. The character database gives
important information on the Unicode characters, and allows complete handling
of case mapping (upper, lower, and title case transformations). Normalization
forms can decompose the characters into letters and marks (diacritics), and
recompose them. If your programs use multiple charsets, multiple languages, or
need information on character properties (e.g. to get the upper case of a
letter, or remove the diacritics from a string), then you probably need
Unicode."
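
To make the two operations named in that description concrete, here is a small sketch using Python's standard unicodedata module. This is not the ustring library's API, which is not shown here; the function names strip_diacritics and upper_case are hypothetical and chosen only for illustration. It decomposes a string into letters and combining marks (NFD), drops the marks, and recomposes the result (NFC).

```python
import unicodedata


def strip_diacritics(text):
    """Remove diacritics by decomposing, dropping combining marks, and recomposing."""
    # NFD splits e.g. U+00E9 into 'e' followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # General categories starting with 'M' are marks (Mn, Mc, Me).
    letters_only = "".join(
        ch for ch in decomposed if not unicodedata.category(ch).startswith("M")
    )
    # NFC recomposes whatever remains into precomposed form.
    return unicodedata.normalize("NFC", letters_only)


def upper_case(text):
    """Upper-case a string using the case mappings built into Python's str type."""
    return text.upper()


if __name__ == "__main__":
    print(strip_diacritics("caf\u00e9"))
    print(upper_case("caf\u00e9"))
```

Letters that have no canonical decomposition, such as 'ø', pass through unchanged, so this approach only removes marks that Unicode treats as separate combining characters.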

--
Karl Ove Hufthammer


