Subject: Re: Zoom/spellcheck bug.
From: Martin Sevior (msevior@mccubbin.ph.unimelb.edu.au)
Date: Fri Oct 27 2000 - 07:59:23 CDT


On Fri, 27 Oct 2000, Thomas Fletcher wrote:

> On Fri, 27 Oct 2000, Martin Sevior wrote:
>
> > Hi David,
> > Thanks for the message. I think this is caused because spell
> > check is an asynchronous process that gets fired every 100
> > milliseconds. I believe Shaw also reported this bug.
>
> It is an asynchronous event, but since the messaging comes
> into the main dispatch queue for the system, then it should
> still be synchronous with the operation of the system. This
> is one of the main reasons why the BeOS timers were such a
> pain to implement because of the multithreaded nature of BeOS
> graphics apps and the implied dependence on a synchronous
> process in the Abi design.
>
> From my meager understanding of the Unix windows event model
> there is a queue of events, these events each get handled
> and dispatched in turn:
>
> Fifo Queue (bottom is oldest, handled first):
>
> 5 [...]
> 4 [Keyboard Event]
> 3 [Timer Event]
> 2 [Zoom/Expose Event]
> 1 [Timer Event]
>
> So in this case the initial timer event (1) would have
> initiated the spell checking to occur. This involved
> iterating the entire document, and you shouldn't actually
> even begin to process the Zoom/Expose event until after
> the state of the system is "normal".

It's not quite that bad. Actually what happens is that each block
(paragraph) gets queued and processed in turn. You can still type and
change the zoom while the spell check continues in the background as this
series of timer events.
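
To make the "one block per timer event" idea concrete, here is a minimal sketch. It is not the actual AbiWord code: Block, BackgroundSpellChecker, onTimerTick() and runDemo() are all hypothetical names invented for illustration, and the "spell check" is reduced to setting a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for one layout block (paragraph).
struct Block {
    std::string text;
    bool checked = false;
};

// Hypothetical background checker: each timer event handles one block.
class BackgroundSpellChecker {
public:
    void queueBlock(Block* b) { m_pending.push_back(b); }

    // Meant to be called from the timer callback. Processes at most one
    // block and returns, so other events in the queue get their turn.
    // Returns true while more blocks are waiting.
    bool onTimerTick() {
        if (m_pending.empty()) return false;
        Block* b = m_pending.front();
        m_pending.pop_front();
        b->checked = true;  // a real checker would examine b->text here
        return !m_pending.empty();
    }

    std::size_t pendingCount() const { return m_pending.size(); }

private:
    std::deque<Block*> m_pending;
};

// Queues three blocks and counts the gaps between ticks in which a
// keystroke or zoom event could be handled. Returns -1 if any block
// was left unchecked.
inline int runDemo() {
    std::vector<Block> doc(3);
    BackgroundSpellChecker checker;
    for (Block& b : doc) checker.queueBlock(&b);

    int gapsForUserEvents = 0;
    while (checker.onTimerTick()) {
        ++gapsForUserEvents;  // the event loop could run other events here
    }
    for (const Block& b : doc) {
        if (!b.checked) return -1;
    }
    return gapsForUserEvents;
}
```

The sketch also shows where the zoom problem comes from: the queue holds raw pointers to blocks, so a layout rebuild between two ticks would leave stale pointers in the queue unless the queue is cleared or the timer is stopped first.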

>
> My personal opinion is that the problem is with the
> design of the timers. The problem is that you might
> instantiate a timer object, start the timer, which
> usually involves passing a pointer to a timer object
> to the system. Then before the timer actually goes
> off the timer is deleted, but not before an event is
> generated and pushed into the queue for processing.
> By the time that the timer event gets processed,
> the object it references is stale and gone.
>

Yeah, I came across exactly this problem in the Modeless
autoupdating WordCount dialog.

> ie a time line looks like
> Initialize & Start Timer, Start processing an event A,
> Timer puts an "Alarm event in the queue", Processing
> of event A causes us to delete timer, Finished event
> A, Start processing timer event, BOOM! Timer event
> invalid.
>
> A partial solution (though not very elegant) would
> be to have "timer slots" which would be filled with
> the pointer to an object. These slots are what
> are passed in to the timer functions. Then when
> a Timer object is created/deleted it fills/empties
> the slots ... which remain persistent throughout
> the application's lifetime. When a timer process
> callback is invoked, it is a two level lookup to
> get the timer object handle:
>
> Get Data from Timer Slot
> if (Data Exists) {
> Do our processing
> } else {
> Ignore this event, timer was deleted
> }
>
> Problems with this: In a threaded environment you
> would have to have a lock on the timer slots. This
> isn't a problem for us since we run in this pseudo
> synchronous model (though someday threads will hugely
> help our interface response). The second problem is
> that you have to have a "wait time" associated with
> the slots so that you don't delete a timer object, then
> immediately fill its slot with a new timer object since
> there might be a timer event in the queue that should
> be ignored. The re-use time is a per-slot value
> that should be relative to the frequency of the timer.
>

Hmm I'll have to think about this.
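
For reference while thinking about it, the slot idea might be sketched roughly as below. This is only an illustration under stated assumptions: Timer, TimerEvent and TimerSlots are hypothetical names, not classes from the AbiWord tree, and the per-slot "wait time" is swapped for a per-slot generation counter, which is a different technique from the one proposed above. A queued event carries the generation it was created under, so an event left over from a deleted timer no longer matches once the slot is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical timer object; 'fired' just counts delivered events.
struct Timer {
    int fired = 0;
};

// What would be handed to the system timer instead of a raw Timer pointer.
struct TimerEvent {
    std::size_t slot;
    std::uint32_t generation;
};

class TimerSlots {
public:
    // Fill a free slot (or add one) and return the token for this timer.
    TimerEvent attach(Timer* t) {
        for (std::size_t i = 0; i < m_slots.size(); ++i) {
            if (m_slots[i].timer == nullptr) {
                m_slots[i].timer = t;
                ++m_slots[i].generation;  // invalidates older tokens
                return {i, m_slots[i].generation};
            }
        }
        m_slots.push_back({t, 1});
        return {m_slots.size() - 1, 1};
    }

    // Empty the slot when the timer object is deleted.
    void detach(const TimerEvent& ev) {
        if (ev.slot < m_slots.size() &&
            m_slots[ev.slot].generation == ev.generation) {
            m_slots[ev.slot].timer = nullptr;
        }
    }

    // Two-level lookup from the callback: ignore the event if the slot is
    // empty or now belongs to a newer timer.
    bool dispatch(const TimerEvent& ev) {
        if (ev.slot >= m_slots.size()) return false;
        Slot& s = m_slots[ev.slot];
        if (s.timer == nullptr || s.generation != ev.generation) return false;
        ++s.timer->fired;
        return true;
    }

private:
    struct Slot {
        Timer* timer;
        std::uint32_t generation;
    };
    std::vector<Slot> m_slots;
};

// Replays the timeline: event queued, timer deleted, slot reused, then the
// stale event and a fresh event are both dispatched.
inline bool staleEventIsIgnored() {
    TimerSlots slots;
    Timer* a = new Timer;
    TimerEvent queued = slots.attach(a);  // pretend this is already queued
    slots.detach(queued);
    delete a;

    Timer b;
    TimerEvent fresh = slots.attach(&b);  // reuses the same slot
    bool staleRan = slots.dispatch(queued);
    bool freshRan = slots.dispatch(fresh);
    return !staleRan && freshRan && b.fired == 1;
}
```

Like the original proposal, this only works if the slot table outlives every timer and every timer's destructor calls detach(); in a threaded environment the table would also need a lock.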

> > The process of zooming involves rebuilding a whole document layout
> > which takes a finite time. During the rebuilding phase the document
> > is in an invalid state for spell checking. A spell check during this phase
> > may cause a segfault. The solution is to set the m_bdontSpellCheckRightNow
> > boolean variable from fv_View.cpp until the zoom is complete. If these
> > are enough clues to fix your problem quickly, give it a go otherwise let
> > me know and I'll try myself.
>
> This would be a temporary solution, but since we are using timers
> all over in several of the modeless dialogs, this could crop up
> any time that you have events in the queue waiting. The event
> service time is entirely dependent on the speed of the system
> so it is a very nasty hit/miss problem.
>

This was solved in the Unix dialogs by putting in some explicit
handshaking with the timer callbacks. You can't delete a Unix Modeless
dialog until the timer callback says it's finished doing its stuff. As far
as I can tell this works 100% of the time in the Unix code.

Actually the solution might be to put this handshaking code into the
Spellchecking Timer callback too. Have a look in ap_UnixFrame.cpp at the
setzoompercentage method. Just add some code to kill the spellchecking timer
with some handshaking code to verify that it is dead before the call to
_showdocument(zoompercentage);

Look in ap_UnixDialog_WordCount.cpp and the destroy() method to see how I
did the handshaking to safely kill the timer there.

The spellchecking timer code is in fl_DocLayout.cpp.

> Anyone else have any thoughts on this quandary? Thankfully all
> of this code is isolated into one class and should actually make
> a fairly good first project for someone with the interest.
>

Hmm I'm not sure if your general solution will work. How do you know if
the pointer in the slot is valid? It is not NULL. I think a better
solution is to make sure timer has returned from doing its thing before
deleting it. We might be able to put in some general handshaking code to
test this in the timer class, i.e. two boolean variables:
m_bDontDoTimerCodeNow and m_bImDoingMyThing.
A callback is only executed if m_bDontDoTimerCodeNow is False.

Then m_bImDoingMyThing is set to true before jumping to the callback.
After returning it's set to False. The timer delete method then sets
m_bDontDoTimerCodeNow to True and only removes the timer when
m_bImDoingMyThing is False.

I think this would solve the threaded problems too.
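
A rough sketch of the two-flag handshake, to show the intended order of operations. The flag names m_bDontDoTimerCodeNow and m_bImDoingMyThing are the ones proposed above; the class HandshakeTimer and its methods fire(), requestStop() and isSafeToDelete() are hypothetical and not part of the existing timer class. The sketch assumes the single-threaded event model; with real threads the plain bools would have to become atomics or be protected by a lock.

```cpp
#include <cassert>
#include <functional>
#include <utility>

class HandshakeTimer {
public:
    explicit HandshakeTimer(std::function<void(HandshakeTimer&)> cb)
        : m_callback(std::move(cb)) {}

    // Called when a timer event is dispatched. The callback only runs if
    // nobody has asked the timer to stop. Returns true if it ran.
    bool fire() {
        if (m_bDontDoTimerCodeNow) return false;
        m_bImDoingMyThing = true;
        m_callback(*this);
        m_bImDoingMyThing = false;
        return true;
    }

    // First half of the delete handshake: block further callbacks.
    // Returns true if the timer can be removed right away, false if a
    // callback is still running and removal has to wait.
    bool requestStop() {
        m_bDontDoTimerCodeNow = true;
        return !m_bImDoingMyThing;
    }

    // Second half: the owner removes the timer only once this is true.
    bool isSafeToDelete() const {
        return m_bDontDoTimerCodeNow && !m_bImDoingMyThing;
    }

private:
    std::function<void(HandshakeTimer&)> m_callback;
    bool m_bDontDoTimerCodeNow = false;
    bool m_bImDoingMyThing = false;
};

// The callback itself asks for the timer to stop (as happens when handling
// an event ends up deleting the timer). Removal is deferred until the
// callback has returned, and a later timer event is ignored.
inline bool handshakeDemo() {
    int runs = 0;
    bool stopWasDeferred = false;
    HandshakeTimer t([&](HandshakeTimer& self) {
        ++runs;
        stopWasDeferred = !self.requestStop();  // still inside the callback
    });
    bool first = t.fire();   // runs the callback once
    bool second = t.fire();  // ignored: stop was requested
    return first && !second && runs == 1 && stopWasDeferred &&
           t.isSafeToDelete();
}
```

In setzoompercentage the same pattern would mean calling something like requestStop() on the spellchecking timer and only going on to the layout rebuild once isSafeToDelete() reports true.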

Cheers

Martin

(I knew my data acquisition experience would be useful one day :-)

> > On Fri, 27 Oct 2000, David Schmitz wrote:
> > > I've discovered an odd bug. It's been befuddling me for a couple of days
> > > and I think I've finally tracked it down. I've been taking some old, old
> > > text files and converting them to .rtf by hand; putting formatting,
> > > typefaces, etc back into the document. Since I've been doing this on and
> > > off, I've been saving, and reloading the documents sporadically.
> > >
> > > One of the first things I do when I open up a document is to set the
> > > zoom to the proper size. (140% is perfect for my monitor, so I plopped
> > > that into the sources and did a recompile) When I choose 140% from the
> > > drop down menu, sometimes AbiWord would crash, and other times it would
> > > not. My dander up, I decided to investigate. (My first issue was to get
> > > rid of that damned GNOME segfault dialog so that the program could dump
> > > a proper core file :-)
> > >
> > > After doing some experimenting, I noticed that the wp would die if I
> > > did my zoom before the program is done putting those red lines under
> > > misspelled words. If I do the zoom *after* the little red lines are all
> > > drawn, the program happily chugs along without skipping a beat.
> > >
> > > After causing a crash, a post-mortem backtrace on the core gives me this:
> > >
> > > (gdb) backtrace
> > > #0 0x405f4d8f in ?? () from /lib/libc.so.6
> > > #1 0x80bc1f5 in _Timer_Proc ()
> > > #2 0x4043c04d in ?? () from /usr/lib/libglib-1.2.so.0
> > > #3 0x4043b186 in ?? () from /usr/lib/libglib-1.2.so.0
> > > #4 0x4043b751 in ?? () from /usr/lib/libglib-1.2.so.0
> > > #5 0x4043b8f1 in ?? () from /usr/lib/libglib-1.2.so.0
> > > #6 0x40240669 in ?? () from /usr/lib/libgtk-1.2.so.0
> > > #7 0x80a0a35 in AP_UnixGnomeApp::main ()
> > > #8 0x80a082b in main ()
> > > #9 0x4051e9cb in ?? () from /lib/libc.so.6
> > > (gdb)
> > >
> > > I get similar results after attaching to a process using ddd.
> > >
> > > And here's output from a ABI_OPT_DEBUG=1 build up until its death:
> > >
> > > [...]
> > > DEBUG: fb_LineBreaker.cpp:162 tab run: p=0x1 type=0 leader=27 height=0
> > > width=0 offset=1 length=140663096
> > > DEBUG: tabstop: unknown tab stop type [L]
> > > DEBUG: fb_LineBreaker.cpp:162 tab run: p=0x1 type=0 leader=27 height=0
> > > width=0 offset=1 length=140664728
> > > DEBUG: tabstop: unknown tab stop type [L]
> > > DEBUG: fb_LineBreaker.cpp:162 tab run: p=0x1 type=0 leader=27 height=0
> > > width=0 offset=1 length=140666400
> > > DEBUG: tabstop: unknown tab stop type [L]
> > > DEBUG: fb_LineBreaker.cpp:162 tab run: p=0x1 type=0 leader=27 height=0
> > > width=0 offset=1 length=140668032
> > > DEBUG: tabstop: unknown tab stop type [L]
> > > DEBUG: fb_LineBreaker.cpp:162 tab run: p=0x1 type=0 leader=27 height=0
> > > width=0 offset=1 length=140669752
> > > DEBUG: tabstop: unknown tab stop type [L]
> > > DEBUG: tabstop: unknown tab stop type [L]
> > > DEBUG: Insertion Point has moved before erasing
> > > DEBUG: fv_View::draw() called with zero drawing area.
> > > DEBUG: fv_View::draw() called with zero drawing area.
> > > DEBUG: fv_View::draw() called with zero drawing area.
> > > DEBUG: ut_unixTimer.cpp: timer destructor
> > > DEBUG: ut_unixTimer.cpp: timer destructor
> > > DEBUG: ut_unixTimer.cpp: timer destructor
> > >
> > > I wish I had more time in my schedule and I'd do some poking about until
> > > I found out what was going on, but I don't, so I thought I'd send this
> > > info on.
> > >
> > > --
> > > David Schmitz
> > > Please allow me to introduce myself:
> > > I'm a man of wealth and taste.
> > > --
> > > http://www.ecsd.com/~david
> > >
> > >
> > >
> >
> >
>
> -------------------------------------------------------------
> Thomas (toe-mah) Fletcher QNX Software Systems
> thomasf@qnx.com Neutrino Development Group
> (613)-591-0931 http://www.qnx.com/~thomasf
>
>
>
