Re: Third report from Ersin

From: Martin Sevior <msevior_at_gmail.com>
Date: Sun Jun 27 2010 - 15:14:58 CEST

Hi Ersin,

Thanks very much for this very detailed report.

A few things. Congratulations on the fixing the very slow typing at
the end of the 3000 page document. This is a big step forward!

There are still many bugs to fix though. Your branch crashes when it
attempts to load rtf-1.7 specification document. I'll email that you
privately as it is rather big ~270 pages of complex layout with many
very large tables. It crashes accessing with your new
getPage()->getRight()->doSomething() calls. I hope can be fixed some
simple NULL pointer checks.

The memory usage bug that prevents 2.8.x from loading csvna.txt is
actually independent of Joaquin's new PT. I fixed that separately and
committed it to the lognpt branch. It was rather a simple fix once I
realized what was happening. Hopefully we can fix the rest of the
memory and slow loading bugs too.

I'll do some more tests on your branch tomorrow. Although scrolling
doesn't redraw the current page for me, if I do selections on the
current page, text appears. Hopefully

m_xScollOffset and m_yScrollOffset are actually the difference between
the top of the document and current viewport in the case of
m_yScrollOffet, and the difference between the left-most edge and the
current viewport in the case m_xScrollOffset.

Can you hold off renaming constants and method names for the time
being? I'd like to check with you first. Maybe you could settle for
putting more nice comments in the code for what each thing does? That
would be very helpful.

Cheers

Martin

On Sat, Jun 26, 2010 at 3:37 PM, Ersin Akinci <ersin.akinci@gmail.com> wrote:
>
> Hi all
>
> For the past two weeks, I concentrated on implementing a "virtual
> canvas" for displaying pages, which is fundamental to both my proposal
> for non-linear page flows and much needed performance improvements.
> The idea is that each page has its own set of xy coords in layout
> units that define the position of its upper left hand corner.  In my
> particular implementation, I've made it so that each page also has
> four new pointers that point to whichever page is currently above,
> below, and to the left and right, and these pointers constitute four
> linked lists.
>
> The new FV_View::_draw function that I implemented then picks whatever
> page the cursor is at as a starting point and traverses along each
> linked list comparing the pages' xy coords to the xy offset of the
> clipping rectangle where the pages are displayed.  While typing on a
> given page, for instance, _draw now only traverses three or four pages
> (one/two pages actually but four times) regardless of where the focus
> is on the document.  Previously it was starting at the beginning of
> the document and traversing through each page, summing the combined xy
> offsets of the pages, and comparing those offsets to the clipping
> rectangle's coords, which meant that large documents (I've been
> working with two, one 300+ pages and the other 3000+ pages) were
> effectively unusable.
>
> They are still unusable, but now for different reasons =).
> Implementing the canvas and rewriting _draw were critical, but now
> other inefficiencies remain, the other really big one being
> FB_ColumBreaker.cpp:_breakSection(), which for example will traverse
> all 3000 pages if you press enter on the first page.  I also suspect
> that some of the utility functions related to vectors, especially
> getNthItem, may be to blame, and I will look into what I can do to fix
> that.  On that note, I've noticed that memory usage is absolutely
> terrible.  In 2.8.6, I can't even open my test files because after 2
> GB I run out of memory and swap is unbearably slow.  With my branch,
> which has Joaquin's new tree piecetable implemented, memory usage caps
> at under 1 GB, which is much better but still awful considering that
> this is a plain text file.  OpenOffice, in contrast, opens almost
> instantly and is using just under 150 MB of RAM.  I would suggest that
> I might look at how they do things, but not only is it liable to be
> totally different, I am still scarred by trying to cross compile OOo,
> from which I learned how complicated their code base is.
>
> Another big problem is that apparently my code is working for me but
> not for others!  Martin reported last night that the screen no longer
> updates when he scrolls down to the bottom of the test document I've
> been working from.  I made a video demonstrating that it works for me,
> but it's a 280 MB Theora-encoded OGV file...so...not sure whether
> YouSendIt will take files that large =).  I'd appreciate it if others
> could checkout my branch (gsoc2010lognpt) and give it a whirl.  Some
> details need to be fixed (e.g., scrolling is jagged now), but the
> basics should work.  I should mention here that printing and multiple
> page views also work with my new code.
>
> Over the next two weeks, I plan to further refine my canvas
> implementation and to look into the _breakSection and utility methods
> that are still slowing things down.  Also, I don't know how people
> would feel about this, but I would be willing to go through the source
> files that I'm working with and give everything a "fashion makeover".
> By this, I mean that a lot of really important members and attributes
> are given really strange names (e.g., the variables that describe the
> clipping rectangle's xy pos are called m_iXScrollOffset and
> m_iYScrollOffset) and groups of members that do logically similar
> functions have completely arbitrary naming schemes.  Also, whitespace
> indentation in the header files sorely needs to be cleaned up.  This
> may seem silly, but for those like me who are new to the code base, it
> makes a world of a difference, and if it isn't cleaned up it may lead
> to duplicate code in the future.  I spent at least a few days trying
> to find a way to report the clipping rectangle's xy coords until I
> happened to stumble upon m_iX/YScrollOffset by pure dumb luck.
>
> Best,
> Ersin
>
> --
> Ersin Y. Akinci -- ersinakinci.com
>
> What Digital Revolution? -- www.whatdigitalrevolution.com
> Thinking critically about digital worlds.
>
Received on Sun Jun 27 15:15:13 2010

This archive was generated by hypermail 2.1.8 : Sun Jun 27 2010 - 15:15:13 CEST