Re: The AbiWord side of a grammar checker (was Re: Implementing support for barbarisms correction)

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Sun Sep 22 2002 - 21:45:17 EDT

    --- Martin Sevior <msevior@physics.unimelb.edu.au> wrote:
    >
    >
    > On Sun, 22 Sep 2002, Dom Lachowicz wrote:
    >
    > > Implementing a vector of incorrect phrases and
    > > squiggling them is trivial. Determining if a given
    > > phrase is correct/incorrect and why it is so is the
    > > hard part in front of us :) Oh, that and our code
    > > to separate things on phrase/sentence boundaries
    > > is non-existent. This'll be non-trivial too. Or the
    > > grammar checking tool will have to determine this
    > > for us. Either way, it's not fun.
    >
    > That is why we can break the problem into two parts.
    > What abiword needs to do and what the grammar
    > checker needs to do.

    Absolutely!

    > I'm definitely not volunteering for the latter, just
    > the former. I don't think it would be too hard to
    > split text into sentences, just look for full stops
    > :-) If this is insufficient we can send out the
    > entire paragraph of text, which is self-contained by
    > default.
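
    For what it's worth, the "just look for full stops" idea
    might look roughly like the sketch below. SentenceSpan and
    splitSentences are made-up names for illustration, not
    existing AbiWord code, and the rule (terminator followed by
    whitespace or end of text) is an assumption on my part.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type: half-open range [start, end) of one sentence
// within the paragraph text.
struct SentenceSpan {
    std::size_t start;
    std::size_t end;
};

// Hypothetical helper: a sentence ends at '.', '!' or '?' when that
// character is followed by whitespace or by the end of the text.
std::vector<SentenceSpan> splitSentences(const std::string& text) {
    std::vector<SentenceSpan> spans;
    const std::size_t n = text.size();
    std::size_t start = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const char c = text[i];
        const bool terminator = (c == '.' || c == '!' || c == '?');
        const bool atBoundary =
            (i + 1 == n) ||
            std::isspace(static_cast<unsigned char>(text[i + 1]));
        if (terminator && atBoundary) {
            spans.push_back({start, i + 1});
            // Skip the whitespace between sentences.
            start = i + 1;
            while (start < n &&
                   std::isspace(static_cast<unsigned char>(text[start]))) {
                ++start;
            }
            i = start - 1;  // the loop increment moves i to start
        }
    }
    // Keep trailing text that has no terminator.
    if (start < n) {
        spans.push_back({start, n});
    }
    return spans;
}
```

    Abbreviations such as "e.g." will produce false boundaries
    with a rule this naive, which is where falling back to the
    whole paragraph comes in.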

    Well, how would we send it out? If we send it just as
    a string of text, how will we pass the start and end
    positions of squiggles? If we pass it in internal
    format it'll be yucky for grammar types to hack on.
    We'll also have to watch out for inline-level
    attributes such as bold and italic, which we don't
    want to bother the grammar checker with.
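
    One possible answer, as a rough sketch only: hand the
    checker a plain string and keep a side table that maps each
    character back to its document position, so squiggle
    offsets in the string can be translated afterwards. TextRun,
    FlatText, DocRange, flatten and toDocRange are all
    hypothetical names, and the idea that non-text items leave
    gaps in the position numbering is an assumption, not a
    description of the real piecetable.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: one stretch of text with formatting already stripped.
struct TextRun {
    std::size_t docPos;  // document position of the run's first character
    std::string text;
};

// Hypothetical: what the grammar checker sees, plus the way back.
struct FlatText {
    std::string text;                 // plain string sent to the checker
    std::vector<std::size_t> docPos;  // docPos[i] is the document position of text[i]
};

// Half-open range [start, end) in document positions.
struct DocRange {
    std::size_t start;
    std::size_t end;
};

// Concatenate the runs and record where every character came from.
FlatText flatten(const std::vector<TextRun>& runs) {
    FlatText flat;
    for (const TextRun& run : runs) {
        for (std::size_t i = 0; i < run.text.size(); ++i) {
            flat.text.push_back(run.text[i]);
            flat.docPos.push_back(run.docPos + i);
        }
    }
    return flat;
}

// Translate a half-open range [start, end) in the flat string into
// document positions. Requires start < end <= flat.text.size().
DocRange toDocRange(const FlatText& flat, std::size_t start, std::size_t end) {
    return {flat.docPos[start], flat.docPos[end - 1] + 1};
}
```

    The checker never sees bold or italic, and it only has to
    report offsets into the string it was given.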

    > If we put in the infrastructure to draw green
    > squiggles and hooks for grammar checking plugins we
    > can punt the hard part of the problem to dedicated
    > programs.

    Yes let's do it as OO as possible.
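
    Something along these lines, perhaps. This is only a
    sketch of what a plugin hook could look like: GrammarIssue,
    GrammarChecker and BarbarismChecker are hypothetical names,
    nothing here exists in the tree, and the plain substring
    match is a placeholder for whatever a real barbarism
    checker would do.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: one suspect region, as a half-open range [start, end)
// of offsets into the text that was checked.
struct GrammarIssue {
    std::size_t start;
    std::size_t end;
    std::string message;
};

// Hypothetical plugin interface: AbiWord would only know about this.
class GrammarChecker {
public:
    virtual ~GrammarChecker() = default;
    virtual std::vector<GrammarIssue> check(const std::string& text) const = 0;
};

// Hypothetical plugin: flags each listed phrase wherever it occurs.
class BarbarismChecker : public GrammarChecker {
public:
    using Table = std::vector<std::pair<std::string, std::string>>;

    // Each entry pairs a phrase to flag with a suggested replacement.
    explicit BarbarismChecker(Table table) : table_(std::move(table)) {}

    std::vector<GrammarIssue> check(const std::string& text) const override {
        std::vector<GrammarIssue> issues;
        for (const auto& entry : table_) {
            const std::string& phrase = entry.first;
            if (phrase.empty()) {
                continue;  // an empty phrase would match everywhere
            }
            std::size_t pos = text.find(phrase);
            while (pos != std::string::npos) {
                issues.push_back(
                    {pos, pos + phrase.size(), "consider: " + entry.second});
                pos = text.find(phrase, pos + phrase.size());
            }
        }
        return issues;
    }

private:
    Table table_;
};
```

    The squiggle code would then only need the start and end
    of each issue, whichever plugin produced it.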

    > Alan Horkin has been hunting these down for us.

    Keep hunting Alan (:

    Andrew.

    > Cheers
    >
    > Martin
    >
    > >
    > > Just keeping things in perspective,
    > > Dom
    > >
    > > On Sunday, September 22, 2002, at 02:07 AM, Martin Sevior wrote:
    > >
    > > > On Sun, 22 Sep 2002, Andrew Dunbar wrote:
    > > >>
    > > >> What we probably need to do is start designing a
    > > >> grammar checker framework, complete with a plugin
    > > >> interface for extensions, and design the barbarism
    > > >> checker as a plugin for it.
    > > >>
    > > >
    > > > I've discovered that I personally definitely need a
    > > > grammar checker, so I'm happy to help out, though not
    > > > take the lead on a grammar checker.
    > > >
    > > > There are two components: the "squiggling"
    > > > implementation and the actual parsing of text.
    > > >
    > > > Regarding the squiggling, we can borrow much of the
    > > > design from the spell-checker.
    > > >
    > > > To remind people, this works by building a vector of
    > > > pointers to fl_BlockLayout classes, then processing
    > > > these during idle time in the GUI mainloop.
    > > >
    > > > The fl_BlockLayout classes contain pointers to text in
    > > > the piecetable, which is separated by white space
    > > > characters into words. These words are fed through
    > > > the spell checker.
    > > >
    > > > A grammar check would do exactly the same, except it
    > > > would have to recognize sentences and pass these
    > > > through to the grammar checker.
    > > >
    > > > I think we can reuse much of the spell checker code
    > > > so that fl_BlockLayouts are passed through to both
    > > > the spell checker and the grammar checker.
    > > >
    > > > If a region of the text is found to be suspect, the
    > > > text is marked with a green squiggle two pixels below
    > > > the red squiggle.
    > > >
    > > > Hmm, the more I think about this, the easier it seems.
    > > > We can re-use a lot of the existing classes and
    > > > methods and just add extra code to split the text
    > > > into sentences as well as words.
    > > >
    > > > The grammar checker would have to mark the start and
    > > > end points of the dodgy text and send this info back.
    > > > Then we reuse the squiggle code to draw between the
    > > > points.
    > > >
    > > > I think this would not be hard to get working rather
    > > > quickly.
    > > >
    > > > See the code in the file fl_BlockLayout.cpp.
    > > >
    > > > Cheers!
    > > >
    > > > Martin
    > > >
    > >
    > >
    >
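
    To make the idle-time scheme Martin describes above a bit
    more concrete, here is a toy sketch. Block is only a
    stand-in for fl_BlockLayout, IdleCheckQueue and onIdle are
    hypothetical names, and I have used a deque rather than the
    vector he mentions so that taking from the front is cheap.
    None of this is the actual spell-checker code.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Stand-in for fl_BlockLayout; the real class is far richer.
struct Block {
    std::string text;
    bool checked = false;
};

// Hypothetical queue of blocks waiting to be checked.
class IdleCheckQueue {
public:
    using CheckFn = std::function<void(Block&)>;

    // The callback could run the spell checker, a grammar checker, or both.
    explicit IdleCheckQueue(CheckFn fn) : check_(std::move(fn)) {}

    // Queue a block; the caller keeps ownership of it.
    void enqueue(Block* block) { pending_.push_back(block); }

    // Meant to be called from the GUI idle handler. Checks at most one
    // block per call so the interface stays responsive, and returns
    // true while more blocks are waiting.
    bool onIdle() {
        if (pending_.empty()) {
            return false;
        }
        Block* block = pending_.front();
        pending_.pop_front();
        check_(*block);
        block->checked = true;
        return !pending_.empty();
    }

private:
    CheckFn check_;
    std::deque<Block*> pending_;
};
```

    Feeding the same queue to both checkers would be one way
    to get the reuse Martin is after.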

    =====
    http://linguaphile.sourceforge.net/cgi-bin/translator.pl http://www.abisource.com
