Re: How to import text boxes from MS Word.

From: Tomas Frydrych (tomasfrydrych_at_yahoo.co.uk)
Date: Sat Mar 20 2004 - 11:35:34 EST

    Hi Martin,

    The basic issue with the footnotes is that they are not stored inside
    the main document as they are in AW, but reside in a separate
    subdocument. So, inside _docProc() we load the positioning
    information for all of the footnotes (i.e., the document positions
    at which the note references are located) -- this is done by the
    _handleNotes() function.

    Later on, every time we read a bit of the document in, we check the
    new document position (in the main document) against the positions of
    the notes we retrieved earlier; if we find a note whose position
    matches the current document position, we insert it -- this is
    handled by the _handleNotesText() function.
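
    As a rough illustration of that matching step (not the actual
    importer code), here is a minimal Python sketch. The function name
    merge_notes and the "[note: ...]" marker are made up for the
    example, and it assumes the note positions are plain character
    offsets into the main text.

```python
def merge_notes(main_text, notes):
    """Insert each note at the main-document position where it is referenced.

    notes is a list of (ref_pos, note_text) pairs; ref_pos is assumed to be
    a character offset into main_text (hypothetical, simplified model).
    """
    pending = sorted(notes)          # positions loaded up front, in order
    out = []
    i = 0
    for pos, ch in enumerate(main_text):
        # Compare the current document position with the stored positions.
        while i < len(pending) and pending[i][0] == pos:
            out.append("[note: %s]" % pending[i][1])
            i += 1
        out.append(ch)
    # Notes referenced exactly at the end of the text.
    while i < len(pending) and pending[i][0] == len(main_text):
        out.append("[note: %s]" % pending[i][1])
        i += 1
    return "".join(out)
```

    For example, merge_notes("abcd", [(2, "x")]) places the note between
    "ab" and "cd".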

    The code for the text boxes should be similar to that for the notes,
    although the docs are, as often, very confusing. The text box is a
    special case of an Office art object, so we will have to retrieve info
    for all art objects and ignore those that are not boxes.

    We retrieve the FSPA structs from the plcspa as we do for the notes,
    except there are two separate plcspa streams, one for the main data
    (plcspaMom) and another for hdr/ftrs (plcspaHdr). I am not sure
    whether we will need two separate tables (like m_pFootnotes) for
    these, or whether we could hold all the data in the same table;
    probably the former.
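
    To show what the two-table option might look like, here is a small
    Python sketch. ShapeAnchor and ShapeTables are hypothetical names,
    the anchor keeps only a position and the spid rather than the full
    FSPA, and it assumes that positions in each stream are relative to
    their own subdocument, so the same position value could turn up in
    both.

```python
from dataclasses import dataclass


@dataclass
class ShapeAnchor:
    cp: int    # position of the anchor within its own subdocument
    spid: int  # shape identifier taken from the FSPA


class ShapeTables:
    """Keeps main-document and header/footer anchors in separate tables."""

    def __init__(self):
        self.main = []     # anchors read from plcspaMom
        self.header = []   # anchors read from plcspaHdr

    def _table(self, in_header):
        return self.header if in_header else self.main

    def add(self, anchor, in_header=False):
        table = self._table(in_header)
        table.append(anchor)
        table.sort(key=lambda a: a.cp)

    def at(self, cp, in_header=False):
        """Return the anchors located at position cp in the chosen stream."""
        return [a for a in self._table(in_header) if a.cp == cp]
```

    With a single shared table, each entry would need an extra flag to
    say which stream it came from; separate tables avoid that.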

    Once we have the FSPAs, we have to translate each FSPA into the actual
    art shape data using the dggInfo table, into which FSPA.spid is
    some kind of an index -- the problem is that the format of the drawing
    data (stored in dggInfo) does not seem to be described in the docs I
    have.

    Once we have the shape data, we should be able to get the TXTID from
    it, from which we can isolate the index n into the plctxbxs: plctxbxs[n]
    will give us the offset of the text for our text box in the textbox
    subdocument; plctxbxs[n+1] holds the position immediately after the
    last of our text, i.e., textlength = plctxbxs[n+1]-plctxbxs[n].
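
    The offset arithmetic can be sketched in Python as follows. This
    assumes the plctxbxs has already been read into a list of offsets
    and the textbox subdocument into a string; textbox_text is a
    made-up helper, and how n is isolated from the TXTID is left out.

```python
def textbox_text(plctxbxs, n, textbox_subdoc):
    """Return the text of text box n from the textbox subdocument.

    plctxbxs[n] is the offset of the first character of the box's text,
    plctxbxs[n + 1] the position immediately after its last character.
    """
    start = plctxbxs[n]
    end = plctxbxs[n + 1]
    text_length = end - start
    return textbox_subdoc[start:start + text_length]
```

    For instance, with offsets [0, 5, 11] and the subdocument
    "HelloWorld!", box 0 yields "Hello" and box 1 yields "World!".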

    The real snag is the translation of the FSPA into the drawing shape
    data; I think you will need to examine the OO importer to find out
    what the format of this data is. The rest is identical to the notes
    handling.

    Tomas
    >
    > Hi Tomas,
    > I would really like to get Text Boxes imported from MS Word.
    > From reading the docs and scanning the wv code it appears that
    > the process is very similar to that for importing
    > footnotes/endnotes. There is a separate set of tables that hold
    > the text outside the main stream of the document flow. It also
    > appears that wv can recognize them and makes them available.
    >
    > However, I don't understand your code that does the footnote/endnote
    > imports. I think that importing text boxes will be very similar,
    > especially since the RTF import of text boxes is a pretty good match to
    > our piecetable - much like footnotes/endnotes are.
    >
    > Anyway, any help you can give me to get text boxes imported from MS Word
    > would be most appreciated :-)
    >
    > Cheers
    >
    > Martin


