Re: AbiSource, Ispell, Aspell and beyond...


Subject: Re: AbiSource, Ispell, Aspell and beyond...
From: Kevin Atkinson (kevinatk@home.com)
Date: Wed Feb 23 2000 - 03:54:54 CST


Kevin Atkinson wrote:

> Aspell needs more info than a simple minded spell checker to do its job
> as well as it can, like communicating back the replacement pairs.

This is needed because aspell is able to learn from users'
misspellings. For example, on the first pass a user misspells
beginning as beging, so aspell suggests:

  begging, begin, being, Beijing, bagging, ....

However, the user then tries "begning" and aspell suggests:

  beginning, beaning, begging, ...

so the user selects beginning. Later on in the document, however,
the user misspells it as begng (NOT beging). Normally aspell would
suggest:

  began, begging, begin, begun, ....

However, because it knows the user misspelled beginning as beging, it
will instead suggest:

  beginning, began, begging, begin, begun ...

BTW I often misspelled beginning (and still do) as something close to
begging, and too many times wind up writing sentences such as "begging
with ....".
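
The learning behaviour described above can be sketched roughly as follows. This is only an illustration, not aspell's actual code: the class name ReplacementMemory, the 0.8 threshold, and the use of Python's difflib for string similarity are all assumptions made for the example. The idea is that the front end reports each (misspelling, chosen correction) pair back, and later suggestion lists get the learned correction moved to the front when the new misspelling is close to a remembered one.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score between two strings."""
    return SequenceMatcher(None, a, b).ratio()


class ReplacementMemory:
    """Hypothetical store of (misspelling -> correction) pairs."""

    def __init__(self, threshold=0.8):
        # threshold is an assumed cut-off, not a value taken from aspell
        self.threshold = threshold
        self.pairs = {}

    def store(self, misspelling, correction):
        """Called by the front end when the user picks a replacement."""
        self.pairs[misspelling] = correction

    def rerank(self, word, suggestions):
        """Put learned corrections ahead of the normal suggestions."""
        scored = []
        for misspelling, correction in self.pairs.items():
            score = similarity(word, misspelling)
            if score >= self.threshold:
                scored.append((score, correction))
        scored.sort(key=lambda item: -item[0])

        promoted = []
        for _, correction in scored:
            if correction not in promoted:
                promoted.append(correction)
        return promoted + [s for s in suggestions if s not in promoted]


memory = ReplacementMemory()
memory.store("beging", "beginning")
print(memory.rerank("begng", ["began", "begging", "begin", "begun"]))
```

Without the stored pair, rerank() returns the suggestion list unchanged, which is why the spell checker needs the replacement pairs communicated back to it.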

Not strictly Aspell related, but have you considered offering the ability
to look up words? The WordNet dictionary, although not 100% complete,
has VERY good definitions and is very free. I have written some
formatting code which formats WordNet's definitions as returned by the
dict server. It would also be fairly easy to use it locally by either
A) running a dict server in the background or B) reading the definitions
directly from the database....
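
For option A, a lookup against a dict server could look something like the sketch below. It is not the gaspell lookup code; it is a minimal client written against the DICT protocol (RFC 2229), and it assumes the server listens on the standard port 2628 and exposes WordNet under the database name "wn". The function names are made up for the example.

```python
import socket


def build_define_command(word, database="wn"):
    """Build a DICT DEFINE command; "wn" is an assumed database name."""
    return 'DEFINE {} "{}"\r\n'.format(database, word)


def parse_definitions(response):
    """Extract definition bodies from the text of a DEFINE response.

    Each definition starts with a 151 status line and ends with a line
    containing only a period.
    """
    definitions = []
    current = None
    for line in response.splitlines():
        if current is None:
            if line.startswith("151"):
                current = []
            continue
        if line == ".":
            definitions.append("\n".join(current).strip())
            current = None
        else:
            # Undo dot-stuffing: a leading ".." stands for a single "."
            current.append(line[1:] if line.startswith("..") else line)
    return definitions


def lookup(word, host, port=2628, database="wn", timeout=10.0):
    """Ask a dict server for definitions of word (needs a reachable server)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="")
        stream.readline()  # skip the 220 greeting banner
        stream.write(build_define_command(word, database))
        stream.flush()
        lines = []
        in_body = False
        while True:
            line = stream.readline()
            if not line:
                break
            lines.append(line)
            if in_body:
                in_body = line.rstrip("\r\n") != "."
            elif line.startswith("151"):
                in_body = True
            elif line[:1] in ("2", "4", "5"):
                break  # final status, e.g. 250 ok or 552 no match
        stream.write("QUIT\r\n")
        stream.flush()
    return parse_definitions("".join(lines))
```

Calling lookup("beginning", "localhost") would cover the "dict server in the background" case, assuming one is running locally.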

If you are interested, let me know, as I have some lookup code in my
gaspell applet. (Yes, it's nasty C++, but it should be easy to uglify it
to your portability standards. :-)

> > 7. We can ship one high-quality dictionary per language per spell engine,
> > without having to worry about endianness or byte size. Likewise, we
> > wouldn't want to have to worry about negotiating IP licenses for the
> > dictionary content.
>
> I do NOT recommend this. You should use the system dictionaries if they
> are available. Also Aspell CAN NOT use a simple word list. Double,
> aspell dictionaries have more than byte order problems and should be
> compiled for each architecture just like the binaries are....

However, Aspell is very modular, and one dictionary class can easily be
replaced by another with little effort provided that you provide the
necessary services. For aspell those are a simple "is this word in the
dictionary" query and a "give me a list of all words with this
sounds-like pattern" query; it must also be possible to iterate through
both lists. So the point is, if you REALLY want a device-independent
dictionary format you can provide aspell with one. However, as I said
before, I do NOT think this is the right path to choose. Compiling the
word list is a VERY easy task in aspell and moderately easy in ispell.
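
To make the required services concrete, here is a rough sketch of what such a replaceable dictionary class has to offer. It is written in Python for brevity and is not aspell's real class hierarchy: the names Dictionary, InMemoryDictionary, and toy_soundslike are invented for the example, and toy_soundslike is only a stand-in for whatever sounds-like transformation the spell checker really uses.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


def toy_soundslike(word):
    """Placeholder sounds-like code (NOT aspell's algorithm): uppercase,
    drop vowels after the first letter, collapse repeated letters."""
    word = word.upper()
    code = word[:1]
    for ch in word[1:]:
        if ch in "AEIOU":
            continue
        if ch != code[-1]:
            code += ch
    return code


class Dictionary(ABC):
    """The services a dictionary backend is assumed to provide."""

    @abstractmethod
    def contains(self, word):
        """Is the word in the dictionary?"""

    @abstractmethod
    def words_sounding_like(self, code):
        """All words whose sounds-like code equals code."""

    @abstractmethod
    def iter_words(self):
        """Iterate through the word list."""

    @abstractmethod
    def iter_soundslike(self):
        """Iterate through the list of sounds-like codes."""


class InMemoryDictionary(Dictionary):
    """Simple backend built from a plain word list at load time."""

    def __init__(self, words):
        self._words = set(words)
        self._by_sound = defaultdict(set)
        for word in self._words:
            self._by_sound[toy_soundslike(word)].add(word)

    def contains(self, word):
        return word in self._words

    def words_sounding_like(self, code):
        return sorted(self._by_sound.get(code, ()))

    def iter_words(self):
        return iter(self._words)

    def iter_soundslike(self):
        return iter(self._by_sound)


words = InMemoryDictionary(["beginning", "begging", "begin"])
print(words.contains("begng"))
print(words.words_sounding_like(toy_soundslike("begng")))
```

Any storage format could sit behind the same four methods, whether a compiled per-architecture table or a portable file.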

Also, I know a bunch of you (the group of developers) have mangled
ispell quite a bit. Do you think it is worth trying to extract the
affix code from ispell so I can use it in aspell, or am I better off
reimplementing it? If you think it is extractable, could one of you give
me a helping hand? I am extremely bad at deciphering nasty C code....

---
Kevin Atkinson
kevinatk@home.com
http://metalab.unc.edu/kevina/


