Re: More Widespread Adoption of Enchant ...

From: F Wolff <friedel_at_translate.org.za>
Date: Thu Sep 15 2011 - 08:34:53 CEST

On Wed, 2011-09-14 at 17:50 -0600, Kevin Atkinson wrote:
> On Wed, 14 Sep 2011, F Wolff wrote:
>
> >
> > Hi Kevin
> >
> > I'm mostly a bystander on this list (Abiword localiser), but do have
> > some interest in Enchant (I'm a developer for Virtaal which uses the
> > Enchant API), and maintain spell checking libraries for South African
> > languages.
>
> Thanks for your comments.
>
> A few notes below.
>
> > Some people have looked at things like foma for constructing better
> > spell checkers, and if they have success on the NLP side, obviously they
> > wouldn't want to integrate it separately with each major consumer of
> > this type of technology. Enchant works wonderfully in this way as it
> > allows new spell checking engines to be used by many applications. Not
> > having to write support for personal word lists, etc. is a big win.
>
> And that is one of my main points and why I believe Enchant has a future.
> I think Dominic Lachowicz is selling himself short. He wrote a very
> successful piece of code. (Something which by the way I tried long ago
> and failed miserably with the Pspell/Aspell combination).
>
> > That said, if somebody wants to get Enchant/Aspell to work with any of
> > these projects, the code in the Voikko project is probably the best
> > start available.
>
> Yes, but there is also Google Chrome, which doesn't have a Voikko model.

Good point.

> > About integration into Mozilla products and Chrome, we have to realise
> > that they place a humongous emphasis on performance these days, and
> > might not accept something if it represents a noticeable regression in
> > any of startup time, executable size, memory consumption or CPU time.
>
> Funny you should say that, because in many ways Aspell can actually help
> here. Especially as far as load times for large dictionaries (Aspell
> dictionaries are precompiled and then mmapped in, so loading a
> dictionary requires essentially no work).

I was completely unaware of how Aspell works. I was thinking more in
terms of adding a layer of indirection through Enchant that might be a
concern for them, but it is interesting in this case, as I think load
time is actually one of the issues with Hunspell. See:
https://bugzilla.mozilla.org/show_bug.cgi?id=468779
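
To make the mmap point concrete, here is a minimal Python sketch of why a
precompiled, memory-mapped word list needs essentially no work at load time.
It is only an illustration of the general technique, not Aspell's actual
on-disk format: the fixed-width record layout, RECORD_SIZE,
compile_dictionary and MappedDictionary are all made up for this example.

```python
import mmap
import os
import tempfile

# Hypothetical fixed record width in bytes (an assumption of this sketch,
# not something Aspell uses).
RECORD_SIZE = 32


def compile_dictionary(words, path):
    """One-time step: write the words as sorted, NUL-padded fixed-width records."""
    encoded = sorted({w.encode("utf-8") for w in words})
    with open(path, "wb") as f:
        for w in encoded:
            if len(w) > RECORD_SIZE:
                raise ValueError("word too long for this toy format")
            f.write(w.ljust(RECORD_SIZE, b"\0"))


class MappedDictionary:
    """Read-only view of a compiled word list.

    Opening it only maps the file; nothing is parsed, and pages are read
    from disk lazily when a lookup touches them.
    """

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._count = len(self._map) // RECORD_SIZE

    def __contains__(self, word):
        key = word.encode("utf-8").ljust(RECORD_SIZE, b"\0")
        lo, hi = 0, self._count
        # Binary search directly over the mapped records.
        while lo < hi:
            mid = (lo + hi) // 2
            record = self._map[mid * RECORD_SIZE:(mid + 1) * RECORD_SIZE]
            if record < key:
                lo = mid + 1
            elif record > key:
                hi = mid
            else:
                return True
        return False

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.dict")
        compile_dictionary(["spell", "checker", "dictionary"], path)
        words = MappedDictionary(path)
        print("checker" in words, "chekcer" in words)
        words.close()
```

The cost of sorting and laying out the data is paid once, when the
dictionary is compiled. A text word list plus affix rules, by contrast, has
to be parsed on every start.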

> > For me as somebody doing a lot of work in non-English languages, I can
> > assure you that both Aspell and Hunspell are great for English! In
> > that sense I struggle to see why better suggestions would be a major
> > selling point for Aspell (if that is indeed the major one), since it is
> > probably "good enough" for many people, and if you are used to spell
> > checkers fairly often missing words, the relative quality of suggestions
> > might not be appreciated.
>
> Well, you can likely spell better than I can. I often have a hard time
> figuring out how to spell a word, and being able to spell a word phonetically
> is where Aspell's strength lies. As for words that are often missing, can you
> give me some examples offline? I am also the maintainer of the
> American and Canadian dictionaries for both Hunspell and Aspell and the
> British dictionary for Aspell (for Hunspell a different dictionary is used,
> but that might change soon, since it has not been updated in a while).

I expressed myself badly. What I meant was: in the two other non-English
languages I write in, missing words are a fairly frequent problem with
the spell checkers, simply because those spell checkers aren't as good as
the English ones. From "our side" it seems as if the English ones
(both Hunspell and Aspell) work really well, since they almost never
have any missing words (maybe because I'm also writing as a second
language speaker with a reduced vocabulary). So while I know that
suggestion quality is an important part of making a good spell checker,
I'm not sure that suggestion quality alone will convince people who are
used to evaluating a spell checker based on the missing words.

Keep well
Friedel

--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/virtaal-070-released
Received on Thu Sep 15 08:35:10 2011
