RE: Summer of Code Hello

From: Leonard Rosenthol <leonardr_at_lazerware.com>
Date: Mon May 29 2006 - 21:05:13 CEST

> It would make a lot of sense to re-use poppler's
> OutputDev data structure as a building block here. I
> don't think that you can hook up its methods directly
> to AbiWord PieceTable operations, however, since you'd
> want to do some detailed analysis on the page's
> contents, and the PDF's data may appear in any order.
>
        I agree with Dom on this...

        Better to do all the work in your own data structures (whatever they may turn out to be) and then bring it into Abi - either directly to the PieceTable OR via HTML/RTF, etc.

 
> It'd be possible to model your plugin on the
> TextOutputDev (or HtmlOutputDev, if it's as good as
> Leonard claims), recording the (x,y) coordinates of
> each text run, as well as the text's font, size,
> actions, ...
>
        HtmlOutputDev already does that for you - breaking the text into runs with coordinates and style info. It is, however, based on older logic from Xpdf so there are some things that could be brought over from there as a START to improving the "coalescing" logic, esp. for columnar data. Then you start on your specific improvements to that logic.
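
        A minimal sketch of the kind of "own data structure" and coalescing pass discussed above, in Python for brevity. Nothing here is poppler or Xpdf API: TextRun, coalesce_lines and y_tolerance are hypothetical names, and the sketch assumes y grows down the page. It only groups runs into lines by baseline proximity and does not attempt column detection.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    # Hypothetical record for one text run; not a poppler type.
    x: float      # left edge of the run
    y: float      # baseline position (assumed to grow down the page)
    text: str
    font: str
    size: float


def coalesce_lines(runs, y_tolerance=2.0):
    """Group runs into lines by baseline proximity, each line ordered left to right.

    Runs may arrive in any order, so they are sorted before grouping.
    """
    lines = []
    for run in sorted(runs, key=lambda r: (r.y, r.x)):
        # Compare against the first run of the current line.
        if lines and abs(lines[-1][0].y - run.y) <= y_tolerance:
            lines[-1].append(run)
        else:
            lines.append([run])
    return [sorted(line, key=lambda r: r.x) for line in lines]
```

        Columnar text would need more than this, for example splitting a line wherever the horizontal gap between neighbouring runs is unusually wide.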

        Finally, you can choose how to output the data structures - HTML, RTF or even direct to the PieceTable.
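
        To illustrate the output step, here is a hedged sketch that turns already-coalesced lines into simple HTML. The function name lines_to_html and the (text, font, size) tuple layout are assumptions made for this example, not anything HtmlOutputDev or AbiWord defines; an RTF or PieceTable back end would walk the same structure.

```python
from html import escape


def lines_to_html(lines):
    """Render lines as HTML paragraphs.

    `lines` is a list of lines; each line is a list of
    (text, font, size) tuples (a hypothetical layout for this sketch).
    """
    parts = []
    for line in lines:
        spans = [
            f'<span style="font-family: {escape(font)}; '
            f'font-size: {size}pt">{escape(text)}</span>'
            for text, font, size in line
        ]
        # One paragraph per line; runs within a line are joined by a space.
        parts.append("<p>" + " ".join(spans) + "</p>")
    return "\n".join(parts)
```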

> One thing to watch out for, however, is that
> libpoppler no longer installs the OutputDev.h header,
> since they've declared it to be private API. Your
> use-case might get them to reconsider this
> distinction, however.
>
        Doubtful :(.

        They really want to keep as much of the internals of Xpdf away from folks, so that they (and/or Derek) can change it without affecting callers of Poppler. But it's certainly a catch-22 for projects such as this.

> Alternately, they might accept an OutputDev that
> converted to (say) OpenDocument or RTF directly in
> libpoppler, and then AbiWord and KWord could jointly
> share that bit of code.
>
        That would be my recommendation!

> In a past life, Leonard was my boss at a PDF-oriented
> company. He has PDF experience coming out the whazoo,
> and his own PDF-related consulting business.
>
        Yeah, but then you got a life, moved away, and don't drink with us anymore ;). (Speaking of which, say "Hi!" to Ruth for me.)

LDR
Received on Mon May 29 21:06:31 2006
