Re: chell specker


Subject: Re: chell specker
From: Paul Rohr (paul@abisource.com)
Date: Wed Jul 11 2001 - 16:38:12 CDT


At 03:57 PM 7/11/01 -0400, Pierre Abbat wrote:
>I just compiled and installed AbiWord and brought up my resumé, in which I
>mention being on strings police for AbiWord, and it called it a misspelling.
>How come AbiWord doesn't know how to spell AbiWord? ;-)

Uh, that'd be a bug.

Actually, the original hashfiles Shaw built (many moons ago) *did* add the
following two words:

  AbiWord
  AbiSource

This is a lot less than other word processor vendors do:

  http://www.alternet.org/story.html?StoryID=11130

Still, I think it'd be a fine idea to guarantee that the hashes *we*
distribute, if any, are rebuilt to also include a small set of words like
this. (BTW, before doing so, we should probably check the distribution
terms of the dictionary we're augmenting to make sure that's OK.)

More specifically, it would help to have a "make dict" target in the tree
which triggers something like the following process:

  - Check out a list of "our" supplemental words from CVS.
  - Look for ispell affix files in some well-known place (SWKP).
  - Locate the appropriate ispell/munchlist scripts in SWKP.
  - Use all 3 to create ispell hashfiles in the format we expect.
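
As a rough illustration of the steps above, here is a minimal Python sketch, not a finished build script. The helper names merge_words and buildhash_command and the example file names are hypothetical. The only external tool it assumes is ispell's buildhash, called as "buildhash dict-file affix-file hash-file". The munchlist pass, which would compress the merged word list against the affix file, is left out.

```python
from pathlib import Path


def merge_words(base_words, extra_words):
    """Merge the base word list with our supplemental words.

    Blank entries and duplicates are dropped; the result is sorted.
    """
    merged = {w.strip() for w in base_words if w.strip()}
    merged.update(w.strip() for w in extra_words if w.strip())
    return sorted(merged)


def buildhash_command(word_file, affix_file, hash_file):
    """Return the argv for ispell's buildhash.

    Assumed invocation: buildhash dict-file affix-file hash-file
    """
    return ["buildhash", str(word_file), str(affix_file), str(hash_file)]


# Hypothetical inputs: two dictionary words plus "our" supplemental words.
words = merge_words(["apple", "banana"], ["AbiWord", "AbiSource"])
command = buildhash_command(
    Path("en-US.words"), Path("en-US.affix"), Path("en-US.hash")
)
print(words)
print(command)
```

A real 'make dict' target would write the merged list to disk and then run the command; the sketch only builds the argument list so it can be inspected first.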

Moreover, if we're going to go to this trouble, then while we're at it we
should rename the !@#$ things properly. For example,

  en-US.hash instead of american.hash
  en-GB.hash instead of british.hash
  etc.

notes
-----
A. We should still keep around the existing "legacy" lookup tables so
people can use their distro's existing american.hash, etc. files if they
want. (No guarantees about whether it'll recognize the name AbiWord,
though.)
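
One way the legacy fallback in note A might look, sketched in Python. The LEGACY_NAMES table and the hash_candidates function are hypothetical names for illustration, and the table holds only the two names mentioned in this message.

```python
# Hypothetical table mapping locale tags to the legacy ispell hash names.
LEGACY_NAMES = {
    "en-US": "american",
    "en-GB": "british",
}


def hash_candidates(locale):
    """Return hashfile names to try: locale-named file first, legacy name second."""
    candidates = [f"{locale}.hash"]
    legacy = LEGACY_NAMES.get(locale)
    if legacy is not None:
        candidates.append(f"{legacy}.hash")
    return candidates


print(hash_candidates("en-US"))
```

Trying the locale-named file first means a freshly built hash wins over a distro's copy whenever both are installed.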

B. In this proposal, 'make dict' is an optional target. It's not required
to build a working tree, but it is available for anyone who wants to build
Abi-compatible dictionaries. For the rest of us, see F below.

C. We define and document SWKP (presumably as peer modules, no?) for the
necessary affix files and scripts, so that our makefiles know where to find
them and mix in our supplemental words.

D. The real trick here is to design the 'make dict' logic to make it as
easy as possible to add a new affix file and have it spit out a hashfile
with the right name and format. For example, if I ran across a (mythical)
welsh.affix file somewhere on the Net, it would be nice if I could just:

  - rename it to cy-GB.affix
  - drop it in SWKP (a peer directory of abi, I assume)
  - cd abi
  - make dict

... and have it spit out a nice Abi-compatible cy-GB.hash I could use for
spell-checking.
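
The naming convention in note D could be driven by a small discovery step like the following Python sketch. It assumes, purely for illustration, that affix files in SWKP are named by locale with an ".affix" suffix. The function name plan_dict_builds and the directory arguments are hypothetical.

```python
from pathlib import Path


def plan_dict_builds(swkp_dir, out_dir):
    """Map each <locale>.affix in swkp_dir to the <locale>.hash it should produce."""
    plan = []
    for affix in sorted(Path(swkp_dir).glob("*.affix")):
        # The locale tag is the file name without its suffix, e.g. "cy-GB".
        plan.append((affix, Path(out_dir) / f"{affix.stem}.hash"))
    return plan
```

With this shape, supporting a new locale requires no makefile change: dropping a new affix file into the directory is enough for it to be picked up.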

E. If we do wind up using munchlist, that would mean adding an optional
(see B above) dependency on Perl. However, that shouldn't faze people on
this list too much. ;-)

F. We definitely still want something like Dom's abispell module to contain
the *results* of this process.

alternative
-----------
If this sounds like too much work, one possible shortcut would be to
approach one of the distros who specialize in l10n (such as Mandrake), and
ask *them* to do the work.

There's no reason for *us* to be in the business of producing en-US.hash,
etc. files that contain the name of our product, so long as *somebody* is
willing to do so.

bottom line
-----------
All I'm proposing is that there be an automatic, repeatable way to *create*
nice Abi-friendly hashfiles for arbitrary locales. This is basically a job
for makefile hackers and/or shell-scripting whizzes.

Any volunteers, or should we make this a POW?

Paul


