Re: Implementing support for barbarisms correction

From: Dom Lachowicz (doml@appligent.com)
Date: Sat Sep 21 2002 - 14:15:45 EDT

    Hi Jordi,

    On Sat, 2002-09-21 at 13:16, Jordi Mas wrote:

    > In the other side, barbarisms are different. They are just wrong words. If you
    > already have a word in your language to express a concept and you use an
    > incorrect one that is a barbarism.

    I'm wondering about languages like French, which will coin its own
    replacement for a word like "computer". How do we best handle that?
    Are all of the borrowed words now "barbarisms", even though they're in
    common use, *everyone* uses them, and they've been used for a decade
    now? How will Abi best handle that?

    > Well, the most popular commercial Catalan spell checker (WordCorrect) has a
    > barbarism correction feature, that's why I thought that it would be cool to have
    > it in Abi also.

    This would be a cool feature. Don't get me wrong - I'm not against this
    feature existing, because you have shown its usefulness. I'm mostly
    against your original proposed implementation or, more correctly, how
    and where to implement this feature. As you know well, the implementation
    of a feature is a different thing from the feature itself.
     
    > I personally believe that originally spell checkers were designed to fix
    > misspelled words as you say, but with the time, they have become a tool that
    > helps people to make less mistakes when they write, that the reason why Word
    > for example, include some kind of basic grammatical correction.

    Yes. But spell checkers should only check spellings. Grammar checkers
    check grammar. In general, these are all "proofing" tools which aid
    in producing more correct documents. Your "barbarism" suggestion seems
    to be more of a grammar issue than a spelling one.

    > Dom, there are not many barbarisms. We are talking usually about 100-200
    > words. We are not talking to re-implement a full spellchecking system. A
    > simple list will be enough. It's true that you will have two entries if the
    > word is plural and only some forms of verbs are barbarisms.

    I'm still unconvinced that a simple list will be enough, though it might
    be. If we do this, I'd ideally like the system to be smart and dynamic
    enough to handle future considerations. My French example above would
    easily have over 200 words (probably more like several thousand) if the
    old words are considered barbarisms.
     
    > Well, you can have a custom.dic that has one word per line, or two words with
    > a special separation character that indicates misspelled word, suggested word.

    I was only saying that our current custom.dic (or separating the
    custom.dic into separate languages, which we really need to do) couldn't
    possibly handle this situation. It could be extended to handle this
    situation.
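
    To make the extended format concrete, here is a minimal sketch of a
    parser for such a file. It assumes a tab as the "special separation
    character"; that choice, the DicEntry struct and parseDicLine are all
    hypothetical and are not part of AbiWord's actual custom.dic handling.

```cpp
#include <string>

// One parsed line of a hypothetical extended custom.dic.
struct DicEntry {
    std::string word;        // the word as it appears in the document
    std::string suggestion;  // empty for an ordinary custom-dictionary word
};

// Assumption: a tab separates a barbarism from its suggested replacement.
// A line without the separator is treated as an ordinary accepted word.
DicEntry parseDicLine(const std::string& line, char sep = '\t') {
    DicEntry entry;
    const std::string::size_type pos = line.find(sep);
    if (pos == std::string::npos) {
        entry.word = line;
    } else {
        entry.word = line.substr(0, pos);
        entry.suggestion = line.substr(pos + 1);
    }
    return entry;
}
```

    A caller would treat an entry with an empty suggestion as a word to
    accept, and an entry with a suggestion as a word to flag.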
     
    > Dom, since you are the maintainer if you say "I don't want this in
    > the main tree." I interpret this as "that's the end of the conversation, keep
    > this feature out the main cvs".

    I should have been more specific. I didn't want your proposed
    *implementation* in main cvs. You can try to convince me that I am wrong
    or can come up with alternate suggestions, or both. Below, I have 2
    suggestions that would fulfill our needs, as I see them, though they may
    be a bit more (or less!) work than your original suggestion. I am still
    convinced that I am not wrong.

    > If you still think that we should use a plugin, can you describe a bit more
    > how do you think that I should work?

    If this is a spelling problem, I'd like to see such a feature moved into
    ispell or aspell. More applications than just AbiWord could take
    advantage of this feature, and it would be solving the problem on the
    correct level, in my opinion. If this problem is "outside of the scope
    of ispell or aspell" as you suggest, why is your counter-example another
    spell checker? Why would you want to mix this with our
    spell-checking code and store the results in a dictionary?

    This raises the question: couldn't this be part of another tool or
    library? I certainly wouldn't object if we did the following in our main
    tree:

    #include <libbarbarism.h>
    if (isBarbarism(word, language))
            addWordToBarbarismList(word, language);

    and when right clicking such a word:
    menu->addSuggestion (barbarismSuggest(word, language));

    and then we could draw maybe a blue squiggle under the word (red for
    spellcheck, blue for barbarisms, green for grammar, ...). Here we're
    maintaining a list of barbarisms separate from misspelled words. This is
    an important semantic difference, not just me being pedantic. A large
    part of our code could then be used by other people, like OpenOffice,
    KOffice, Evolution, ...
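
    As a rough illustration of how small such a library's core could be,
    the sketch below backs isBarbarism and barbarismSuggest with a
    per-language lookup table. No libbarbarism exists; the table layout,
    the "xx-XX" language tag and the sample entry are placeholders chosen
    for this example only.

```cpp
#include <map>
#include <string>

// Hypothetical storage: language tag -> (barbarism -> suggested replacement).
typedef std::map<std::string, std::map<std::string, std::string> > BarbarismTable;

// Placeholder data. A real library would load per-language lists from disk.
static const BarbarismTable& barbarismTable() {
    static BarbarismTable table;
    if (table.empty())
        table["xx-XX"]["wrongword"] = "rightword";
    return table;
}

// True if `word` is listed as a barbarism for `language`.
bool isBarbarism(const std::string& word, const std::string& language) {
    const BarbarismTable& table = barbarismTable();
    auto lang = table.find(language);
    return lang != table.end() && lang->second.count(word) != 0;
}

// Returns the suggested replacement, or an empty string if there is none.
std::string barbarismSuggest(const std::string& word, const std::string& language) {
    const BarbarismTable& table = barbarismTable();
    auto lang = table.find(language);
    if (lang == table.end())
        return std::string();
    auto entry = lang->second.find(word);
    return entry == lang->second.end() ? std::string() : entry->second;
}
```

    Keeping the lookup keyed by language means the application never has
    to merge barbarism data into its spelling dictionaries.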

    If this problem is a different proofing issue unto itself, you could
    reuse the spelling dialog to create a new tool, and install that tool
    into the "tools" menu. My gross simplifying assumption, to make this
    an easy problem, is that barbarisms would naturally be misspelled
    words. So by overloading a virtual method, you could see if the next
    misspelled word is a barbarism. You could then make a suggestion from
    a word list and put it into the dialog. The user could then "change
    all", "ignore all", "ignore just this one", "add", etc. Note that this
    is *not* the same as treating the barbarism as a misspelled word, as
    in your original suggestion.
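
    The virtual-method approach might look roughly like the sketch below.
    SpellDialog, BarbarismDialog and nextWord are invented names standing
    in for whatever the real spelling dialog exposes; they are not
    AbiWord's actual classes, and the word list is passed in directly to
    keep the example self-contained.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a spelling dialog; not AbiWord's real class.
class SpellDialog {
public:
    explicit SpellDialog(std::vector<std::string> misspelled)
        : m_misspelled(std::move(misspelled)), m_pos(0) {}
    virtual ~SpellDialog() {}

    // Fetches the next word to show the user; returns false when none remain.
    virtual bool nextWord(std::string& word, std::string& suggestion) {
        if (m_pos >= m_misspelled.size())
            return false;
        word = m_misspelled[m_pos++];
        suggestion.clear();  // a real dialog would query the spell checker here
        return true;
    }

private:
    std::vector<std::string> m_misspelled;
    std::size_t m_pos;
};

// A separate tool that reuses the dialog but stops only on barbarisms.
class BarbarismDialog : public SpellDialog {
public:
    BarbarismDialog(std::vector<std::string> misspelled,
                    std::map<std::string, std::string> replacements)
        : SpellDialog(std::move(misspelled)),
          m_replacements(std::move(replacements)) {}

    // Skips misspelled words that are not in the barbarism list.
    bool nextWord(std::string& word, std::string& suggestion) override {
        std::string candidate, unused;
        while (SpellDialog::nextWord(candidate, unused)) {
            auto it = m_replacements.find(candidate);
            if (it != m_replacements.end()) {
                word = candidate;
                suggestion = it->second;
                return true;
            }
        }
        return false;
    }

private:
    std::map<std::string, std::string> m_replacements;
};
```

    Because only the word-selection step is overridden, the "change all"
    and "ignore all" handling would be inherited unchanged from the
    spelling dialog.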

    When you say "we'll see if this is not useful, and can move it into a
    plugin" it kind of hurts me. Plugins are very useful things. I'd hardly
    call our JPEG and ImageMagick image importers not-useful. I wouldn't
    call our filters not-useful. I wouldn't call the thesaurus or
    translation tools not-useful. They are all very useful. There is a huge
    difference between being a plugin and not being useful at all. Some
    features are best implemented inside of Abi. Some features best live as
    add-ons. Sometimes the line is blurry.

    > I appreciate your comments Dom,

    And I yours, Jordi. Thanks for not getting (too) discouraged.

    Cheers,
    Dom


