Re: Summer of Code Hello

From: Dom Lachowicz <domlachowicz_at_yahoo.com>
Date: Mon May 29 2006 - 19:21:57 CEST

Hi Jauco,

> I was thinking of maybe writing the plugin as its own outputdev that
> would add data to the piecetable through the outputdev hook functions
> and then work its magic on the piecetable itself, but I'm not sure
> yet.

It would make a lot of sense to re-use poppler's
OutputDev data structure as a building block here. I
don't think that you can hook up its methods directly
to AbiWord PieceTable operations, though, since you'd
want to do some detailed analysis of the page's
contents first, and the PDF's data may appear in any order.

AbiWord expects a logical ordering, and you
won't know how to (e.g.) break text into columns
until all of the text has been drawn.
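
To make the ordering problem concrete, here is a minimal Python sketch
of deferring the layout decision until the whole page is known. It is
not poppler or AbiWord code: the function name reading_order, the
(x, y, text) tuple, the column_gap threshold, and the assumption that
y grows downward are all made up for illustration. It clusters runs
into columns by their left edge and only then reads each column top to
bottom.

```python
from typing import List, Tuple

# Hypothetical record: (x, y, text). Assumes y grows downward on the page.
Run = Tuple[float, float, str]


def reading_order(runs: List[Run], column_gap: float = 50.0) -> List[str]:
    """Return run texts column by column, once every run on the page is known."""
    if not runs:
        return []
    # Cluster runs into columns: a jump in left edge larger than
    # column_gap (an arbitrary illustrative threshold) starts a new column.
    by_x = sorted(runs, key=lambda r: r[0])
    columns = [[by_x[0]]]
    for run in by_x[1:]:
        if run[0] - columns[-1][-1][0] > column_gap:
            columns.append([run])
        else:
            columns[-1].append(run)
    # Within each column, read top to bottom, then left to right.
    ordered = []
    for column in columns:
        for _x, _y, text in sorted(column, key=lambda r: (r[1], r[0])):
            ordered.append(text)
    return ordered


# Draw order as it might arrive from a PDF: the two columns are interleaved.
page = [
    (300.0, 100.0, "right 1"),
    (72.0, 120.0, "left 2"),
    (72.0, 100.0, "left 1"),
    (300.0, 120.0, "right 2"),
]
print(reading_order(page))  # ['left 1', 'left 2', 'right 1', 'right 2']
```

A real page would need something smarter than a fixed gap (headings
that span columns, tables, indented paragraphs), but the point stands:
none of this can be decided from inside a single draw callback.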

It'd be possible to model your plugin on the
TextOutputDev (or HtmlOutputDev, if it's as good as
Leonard claims), recording the (x,y) coordinates of
each text run, as well as the text's font, size,
actions, ...
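
As a rough illustration of the "record now, analyse later" idea, here
is a small Python sketch. Everything in it is hypothetical: TextRun,
RecordingDevice, draw_text and lines are names I made up, not the
TextOutputDev or HtmlOutputDev interface. The device only stores each
run with its position, font and size; grouping into lines happens
afterwards.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextRun:
    """One recorded piece of text (hypothetical structure)."""
    x: float
    y: float
    text: str
    font: str
    size: float


@dataclass
class RecordingDevice:
    """Hypothetical stand-in for an OutputDev-style callback sink."""
    runs: List[TextRun] = field(default_factory=list)

    def draw_text(self, x: float, y: float, text: str, font: str, size: float) -> None:
        # Only record the run; no document operations happen here.
        self.runs.append(TextRun(x, y, text, font, size))

    def lines(self, tolerance: float = 2.0) -> List[List[TextRun]]:
        """Group recorded runs into lines by baseline, each sorted left to right."""
        result: List[List[TextRun]] = []
        for run in sorted(self.runs, key=lambda r: (r.y, r.x)):
            # Runs whose baseline is within `tolerance` of the line's
            # first run are treated as the same line.
            if result and abs(run.y - result[-1][0].y) <= tolerance:
                result[-1].append(run)
            else:
                result.append([run])
        for line in result:
            line.sort(key=lambda r: r.x)
        return result


dev = RecordingDevice()
dev.draw_text(120.0, 100.0, "world", "Times", 12.0)
dev.draw_text(72.0, 100.5, "Hello", "Times", 12.0)
dev.draw_text(72.0, 130.0, "Next", "Times-Bold", 14.0)
for line in dev.lines():
    print(" ".join(run.text for run in line))
```

Keeping font and size on each run means the later analysis can also
guess at headings or emphasis before anything is written to the
PieceTable.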

One thing to watch out for is that libpoppler no
longer installs the OutputDev.h header, since they've
declared it to be private API. Your use-case might get
them to reconsider this distinction, though.

Alternatively, they might accept an OutputDev that
converted to (say) OpenDocument or RTF directly in
libpoppler, and then AbiWord and KWord could share
that bit of code.

> I'm always interested to deliver you from any excess
> experience :P

In a past life, Leonard was my boss at a PDF-oriented
company. He has PDF experience coming out the whazoo,
and his own PDF-related consulting business.

Best,
Dom

Received on Mon May 29 20:38:30 2006
