Re: Barbarism implementation proposal

From: Dom Lachowicz (doml@appligent.com)
Date: Tue Sep 24 2002 - 16:43:00 EDT

    On Tuesday, September 24, 2002, at 04:26 PM, Jordi Mas wrote:

    > <Barbarism
    > word="tamany"
    > suggestion1="mida"
    > suggestion2="grandària"
    > />

    For what it's worth, Jordi and I have reached a good consensus on this.
    I've made a suggestion to him regarding the XML grammar. It should look
    something like this instead:

    <barbarism word="tamany">
            <suggestion word="mida" />
            <suggestion word="grandària" />
            <suggestion ... />
    </barbarism>

    This lets the suggestion list grow easily and simplifies the import
    logic.
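
    As a rough sketch of why the nested form keeps the import simple: assume
    a SAX-style parser that reports element start and end events. The
    BarbarismImporter class, the Attributes alias and the callback names
    below are hypothetical, not existing AbiWord code. Each <suggestion>
    is just appended to the list of the enclosing <barbarism>, so no
    numbered attribute names need to be probed.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical attribute list, as a SAX-style parser might deliver it.
using Attributes = std::map<std::string, std::string>;

// Hypothetical importer: builds a barbarism -> suggestions table from
// element events. Not an existing AbiWord class.
class BarbarismImporter {
public:
    // Called for every opening (or self-closing) tag.
    void startElement(const std::string& name, const Attributes& attrs) {
        auto word = attrs.find("word");
        if (word == attrs.end())
            return;                          // ignore elements without a word
        if (name == "barbarism") {
            m_current = word->second;
            m_table[m_current];              // create an empty suggestion list
            m_inBarbarism = true;
        } else if (name == "suggestion" && m_inBarbarism) {
            m_table[m_current].push_back(word->second);
        }
    }

    // Called for every closing tag.
    void endElement(const std::string& name) {
        if (name == "barbarism")
            m_inBarbarism = false;
    }

    const std::map<std::string, std::vector<std::string>>& table() const {
        return m_table;
    }

private:
    std::map<std::string, std::vector<std::string>> m_table;
    std::string m_current;                   // barbarism currently being read
    bool m_inBarbarism = false;
};
```

    Feeding it the events for the "tamany" entry above would leave a table
    entry with two suggestions, in document order.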

    > * Known problems in the design
    >
    > - We work at word level, not sentence level. We are just hacking a
    > spell checker

    I'll work on an interface and implementation for this which we can use
    later. It will necessarily resemble:

    Iterator Document::getParagraphIterator()

    Iterator ParagraphIterator::getSentenceIterator()
    string ParagraphIterator::getTarget()

    Iterator SentenceIterator::getWordIterator()
    string WordIterator::getTarget()

    Once we have a reasonably working sentence iterator, we can start
    hooking up grammar checkers. The word iterator that comes with it
    might also help clean up the massive amount of garbage in our current
    spelling queuing code.
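
    To make the nesting concrete, here is a minimal sketch of the three
    levels over a plain std::string. Everything in it is an assumption made
    for illustration: the TextIterator class, the free functions, and the
    naive splitting rules (newline for paragraphs, '.', '!' and '?' for
    sentences, whitespace for words). A real implementation would walk the
    document's own structures instead of copying text.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical iterator over pre-split pieces of text.
class TextIterator {
public:
    explicit TextIterator(std::vector<std::string> items)
        : m_items(std::move(items)) {}
    bool hasNext() const { return m_pos < m_items.size(); }
    void next() { ++m_pos; }
    // Throws std::out_of_range if called past the end.
    const std::string& getTarget() const { return m_items.at(m_pos); }
private:
    std::vector<std::string> m_items;
    std::size_t m_pos = 0;
};

// Strips leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Paragraphs: one per non-empty line (naive assumption).
TextIterator paragraphIterator(const std::string& doc) {
    std::vector<std::string> out;
    std::istringstream in(doc);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (!line.empty()) out.push_back(line);
    }
    return TextIterator(std::move(out));
}

// Sentences: text up to and including '.', '!' or '?' (naive assumption).
TextIterator sentenceIterator(const std::string& paragraph) {
    std::vector<std::string> out;
    std::string cur;
    for (char c : paragraph) {
        cur += c;
        if (c == '.' || c == '!' || c == '?') {
            cur = trim(cur);
            if (!cur.empty()) out.push_back(cur);
            cur.clear();
        }
    }
    cur = trim(cur);
    if (!cur.empty()) out.push_back(cur);    // trailing text without a terminator
    return TextIterator(std::move(out));
}

// Words: whitespace-separated tokens with trailing punctuation removed.
TextIterator wordIterator(const std::string& sentence) {
    const std::string punct = ".,;:!?";
    std::vector<std::string> out;
    std::istringstream in(sentence);
    std::string w;
    while (in >> w) {
        while (!w.empty() && punct.find(w.back()) != std::string::npos)
            w.pop_back();
        if (!w.empty()) out.push_back(w);
    }
    return TextIterator(std::move(out));
}
```

    A grammar checker would consume getTarget() at the sentence level, while
    the spell checker (and the barbarism lookup) would consume it at the
    word level.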

    > - Words that can be declined have to be coded several times (plurals,
    > verbs declinations, etc). At least in Catalan, this is not very
    > common.

    On a related note, we may want to implement a multimap (one-to-many)
    structure here for efficiency. We could probably get away with using a
    UT_Map or UT_StringPtrMap. The value stored for each barbarism would be
    a UT_Vector containing UT_UTF8String pointers.

    string barbarism -> string suggestion1, string suggestion2, ...
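
    A minimal sketch of that one-to-many lookup, using standard containers
    as stand-ins: std::unordered_map, std::vector and std::string take the
    place of UT_Map, UT_Vector and UT_UTF8String here, and the
    BarbarismTable class and its method names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical table mapping one barbarism to many suggestions.
// Standard containers stand in for AbiWord's UT_* classes.
class BarbarismTable {
public:
    // Appends a suggestion, creating the entry on first use.
    void add(const std::string& barbarism, const std::string& suggestion) {
        m_map[barbarism].push_back(suggestion);
    }

    // Returns the suggestion list, or nullptr if the word is not a
    // known barbarism. Average-case constant-time lookup.
    const std::vector<std::string>* lookup(const std::string& word) const {
        auto it = m_map.find(word);
        return it == m_map.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, std::vector<std::string>> m_map;
};
```

    The spell-check path would then call lookup() once per word and offer
    the returned list only when the result is non-null.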

    Cheers,
    Dom


