Re: AbiSource, Ispell, Aspell and beyond...


Subject: Re: AbiSource, Ispell, Aspell and beyond...
From: Paul Rohr (paul@abisource.com)
Date: Fri Feb 25 2000 - 20:50:33 CST


At 04:54 AM 2/23/00 -0500, Kevin Atkinson wrote:
>> Aspell needs more info than a simple minded spell checker to do its job
>> as well as it can, like communicating back the replacement pairs.
>
>This is needed because aspell is able to learn from users'
>misspellings. For example, on the first pass a user misspells
>beginning as beging, so aspell suggests:
>
> begging, begin, being, Beijing, bagging, ....
>
>However the user then tries "begning" and aspell suggests
>
> beginning, beaning, begging, ...
>
>so the user selects beginning. However, later on in the document
>the user misspells it as begng (NOT beging). Normally aspell would
>suggest:
>
> began, begging, begin, begun, ....
>
>However, because it knows the user misspelled beginning as beging, it will
>instead suggest:
>
> beginning, began, begging, begin, begun ...
>
>BTW I often misspelled beginning (and still do) as something close to
>begging and too many times wind up writing sentences such as "begging
>with ....".

Hmm. That's an interesting approach. I hadn't thought of feeding the
corrections chosen back into the engine to help improve the quality of
future suggestions. I guess I'm a good enough speller already that I've
never needed adaptive algorithms like that.

Your examples still make it sound a bit like black magic, though. :-)
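
To take some of the magic out of it, the bookkeeping described above might be
sketched roughly as follows. This is only an illustration in Python, not
aspell's actual algorithm or API: the class name, the method names, and the
rule used here (promote a remembered correction whenever the new misspelling
is within one edit of a misspelling the user has already corrected) are all
assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


class LearningSuggester:
    """Hypothetical suggester that remembers replacement pairs."""

    def __init__(self, words, max_learn_distance=1):
        self.words = list(words)
        self.pairs = {}  # misspelling -> correction the user chose
        self.max_learn_distance = max_learn_distance

    def store_replacement(self, misspelling, correction):
        """Record the pair the application reports back to the engine."""
        self.pairs[misspelling] = correction

    def suggest(self, word, limit=5):
        # Baseline ranking: closest dictionary words first.
        ranked = sorted(self.words, key=lambda w: (edit_distance(word, w), w))
        # Promote corrections whose remembered misspelling resembles `word`.
        learned = []
        for miss, corr in self.pairs.items():
            close = edit_distance(word, miss) <= self.max_learn_distance
            if close and corr not in learned:
                learned.append(corr)
        rest = [w for w in ranked if w not in learned]
        return (learned + rest)[:limit]
```

With a small word list containing "beginning", storing the pair
("beging", "beginning") and then asking for suggestions for "begng" moves
"beginning" to the front, because "begng" is one edit away from the
remembered "beging". Without the stored pair, "beginning" ranks behind the
shorter words. This is also why the application has to report the
replacement pairs back to the engine.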

>Not strictly Aspell related, but have you considered offering the ability
>to look up words? The WordNet dictionary, although not 100% complete,
>has VERY good definitions and is very free. I have written some
>formatting code which formats WordNet's definitions as returned by the
>dict server. It would also be fairly easy to use it locally by either
>A) running a dict server in the background or B) reading the definitions
>directly from the database....
>
>If you are interested let me know, as I have some lookup code in my
>gaspell applet.

I personally won't have time to look at this, but it might make an
interesting project for someone who does.

>(Yes, it's nasty C++, but it should be easy to uglify it
>to your portability standards. -)

Kevin, do I sound this snide to you? If so, please let me know. I really
don't want to come across as a totally grumpy guy.

>So
>the point is if you REALLY want a device dependent dictionary format you
>can provide aspell with one. However, as I said before, I do NOT think
>this is the right path to choose. Compiling the word list is a VERY easy
>task in aspell and moderately easy in ispell.

I think we're in violent agreement here. Having device-dependent dictionary
formats is a Bad Thing, right?

>Also, I know a bunch of you (the group of developers) have mangled
>ispell quite a bit.

You're probably overestimating how much work we've done on ispell. Mostly
we've just deleted and ANSI-fied the code to make it work at all.

>Do you think it is worth trying to extract the
>affix code from ispell so I can use it in aspell, or am I better off
>reimplementing it? If you think it is extractable, could one of you give
>me a helping hand? I am extremely bad at deciphering nasty C code....

I don't think *any* of us really understand the existing ispell code well
enough to help you isolate specific algorithms from it. A bunch of us have
whittled around the edges on an as-needed basis, and then got out of there
as fast as we could.

Paul


