List:       koffice
Subject:    Fulfill you request (Filters)
From:       Werner Trobin <wtrobin () carinthia ! com>
Date:       2000-01-24 21:19:10


Hi!

You said you want it - so here it is. The reason I don't like sending
this to the list is that it's really long...

Please don't flame me for that :)

Werner

KOFilter template mail :)
#############################################################################

As the filters are a separate library you won't learn much about
KDE programming (only a little bit of Qt - the QTL - the replacement
for the STL). If you still want to help me just read on :)

I don't know if you already use KDE 2.0 - anyway, I'll give you a
short install guide:

I'd like to start by explaining the basic rules for installing KDE 2
(parallel to KDE-1.x.x!) because KOffice needs the new KDE 2.
Note: This is one possibility of hundreds and works for me on my SuSE
Linux 6.1 (should work on most Distros, though):
- Create a new user (it's cleaner that way). Use this user for all the
  following stuff - it will be your KDE 2 user. As you don't change the
  stuff for all the other users they will use KDE-1.x.x
- Download a current Qt-2.1pre snapshot (somewhere on ftp://ftp.troll.no).
  To update it from time to time you might want to install rsync
  (http://rsync.samba.org). To get the newest Qt without downloading the
  (quite large) snapshots, just type:
  rsync -a -v -z rsync.troll.no::qt /your/local/qt-2.1/path
- Set QTDIR, PATH, LD_LIBRARY_PATH,... in the '.bashrc' of your KDE 2
  user (Simply add these lines and adjust the path (i.e. replace 'foo')):

  ### KDE 2 stuff ###########################################

  export WINDOWMANAGER=/home/foo/kde/bin/startkde
  export KDEDIR=/home/foo/kde
  export QTDIR=/home/foo/qt
  export LD_LIBRARY_PATH=/home/foo/qt/lib:/home/foo/kde/lib
  export LD_RUN_PATH=/home/foo/qt/lib:/home/foo/kde/lib
  export PATH=/home/foo/kde/bin:/home/foo/qt/bin:$PATH

  ###########################################################

- Compile Qt and make sure that this Qt replaces your standard Qt-1.4x.
  This is esp. important because of the new moc version! If the following
  stuff doesn't work, please check if you use the correct moc.
- You may have to download & install autoconf-2.13 and automake-1.4.
  (Should be standard, but if you need it, it's on ftp://ftp.gnu.org/auto*)
- Get at least the packages kdesupport, kdelibs, and koffice; maybe
  you'd like to use kdebase, too. I suggest using CVSup to get the
  snapshots (http://www.kde.org/cvsup.html), but you may download them
  from the ftp server, too. This is easier, but if you want to update the
  installation it takes longer as you have to download everything every
  time. (ftp://ftp.kde.org/pub/kde/unstable/CVS/snapshots/current/)
- Compile and install these packages in the correct order (support, libs,
  base, office). Note: I have to use
  './configure --with-qt-libraries=/my/local/qt/path'
  to make that link correctly!
- You might have to edit your startx script and do some additional PATH
  stuff depending on your distro... (i.e. make sure that the 'startkde'
  script from $KDEDIR/bin is used)
- Have fun with KDE-2.0pre Alpha. Note: You have to log out with
  Ctrl+Alt+Backspace as the normal logout is broken at the moment...
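
If you are unsure whether the right moc is picked up, a small check like
the following Python sketch might help. The function name check_kde2_env
is made up for illustration, and the sketch simply assumes that the new
moc lives in $QTDIR/bin, as in the '.bashrc' lines above.

```python
import os
import shutil


def check_kde2_env(env=None):
    """Return a list of human-readable problems with the KDE 2 environment."""
    env = os.environ if env is None else env
    problems = []
    for var in ("QTDIR", "KDEDIR"):
        if not env.get(var):
            problems.append(f"{var} is not set")
    qtdir = env.get("QTDIR")
    if qtdir:
        # Assumption: the moc found first on PATH should sit in $QTDIR/bin.
        moc = shutil.which("moc", path=env.get("PATH", ""))
        expected = os.path.realpath(os.path.join(qtdir, "bin"))
        if moc is None:
            problems.append("no moc found on PATH")
        elif os.path.dirname(os.path.realpath(moc)) != expected:
            problems.append(f"moc resolves to {moc}, not to {expected}")
    return problems
```

Calling check_kde2_env() without arguments inspects the current shell
environment; an empty list means nothing suspicious was found.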

So...now that your KDE-2.0pre Alpha is up and running we can have a look
at the filter stuff. If you are interested in the technical background,
please have a look at the koffice/filters/HOWTO file, browse through the
sources or ask me.

The WinWord filter (koffice/filters/olefilters/winword) is in a very
early stage, but if I have some time in the next weeks it should be able
to import simple formatted text (i.e. bold, italic, chapters, headlines,
tables,...). If you need information on the structure, just ask me.

The main problem is that it's quite difficult to get (correct)
documentation. Have a look at http://msdn.microsoft.com/ - AFAIK there is
some info. If you can't find it there http://www.wotsit.org might help.

There is also some "inspiration" out there (the wv-library from Caolan
McNamara, http://skynet.csn.ul.ie/~caolan)

The Excel filter is developed by Percy Leonhardt (percy@linuxfreak.com).
This filter is quite usable because Excel uses simple records to store
the information. Therefore you just leave the unknown records out and
decode the important ones. IMHO it's very easy to join Percy because
the records are quite independent, so several people can work on that
at the same time.
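
The "decode what you know, skip the rest" idea can be sketched in Python
roughly like this. Note that this is not the actual filter code: the
record header layout (a 2-byte little-endian id followed by a 2-byte
little-endian payload length), the record id 0x0001 and all function
names are assumptions made for the example.

```python
import struct


def iter_records(data):
    """Yield (record_id, payload) pairs from a byte string of records.

    Assumed layout: each record starts with a 2-byte little-endian id
    and a 2-byte little-endian payload length.
    """
    pos = 0
    while pos + 4 <= len(data):
        rec_id, length = struct.unpack_from("<HH", data, pos)
        pos += 4
        yield rec_id, data[pos:pos + length]
        pos += length


def decode(data, handlers):
    """Decode the records we have a handler for and skip all others."""
    results = []
    for rec_id, payload in iter_records(data):
        handler = handlers.get(rec_id)
        if handler is not None:
            results.append(handler(payload))
    return results


# Hypothetical record id 0x0001 carrying one 16-bit integer.
HANDLERS = {0x0001: lambda payload: struct.unpack("<H", payload)[0]}
```

Because every handler only looks at its own payload, new record types
can be added to the table without touching the existing ones.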

The Powerpoint filter is just a template at the moment, but I think
this filter is not *that* important right now.

A very important non M$ filter is RTF import/export. Maybe you want
to start developing that. Note: IIRC someone already started an RTF
import filter - please ask on the KOffice mailing list!

If you have some code you want to check in, questions, suggestions,
flames,... feel free to mail.

Have fun,
Werner
######################################################################
["kword.xml.spec" (text/plain)]

-------------------------------------------------------------------------------
-                                                                             -
-  KWord - XML File Format Description v0.0.3                                 -
-                                                                             -
-  by Werner, wtrobin@carinthia.com - last changes: 19.09.99                  -
-  Please be so kind to report all the errors, typos,... you'll surely find!  -
-  Beware: The KWord-format is moving quite fast these days (thanks to        -
-          Reggie for his great work) so some information might be outdated!  -
-------------------------------------------------------------------------------


The KWord file format is a (more or less :) human-readable XML format. This 
means it consists of HTML-like tags which define the document structure and
the contents. The main structure of each KWord document is a header and a body.
In the header-part things like the paper size, the author,... are stored.

I'd like to start with an example to explain the contents of the body. You
might have noticed that you can do nearly everything with the frames in your
KWord document (e.g. move them around, interlock them, let your text "flow"
through them,...). To achieve this flexibility KWord has to store the data in
a very well defined structure including framesets, frames, paragraphs, and so
on.

As you might remember you were able to choose between templates for "DTPing"
and simple "Wordprocessing" right in the beginning (after launching the killer-
app). The only difference is that KWord offers some help in managing the layout
of the first frame in Wordprocessing-mode (in most cases you'll only have one).
Due to that fact "DTPing" offers more flexibility and "Wordprocessing" works
almost automagically :)


Some basic notes:
-----------------

- All kinds of numbers are stored like this: foo="1" (between " and " :)
- A rational number looks like this: width="1.03" (note: '.' is used as
  the decimal separator)
- <XYZ foo="100" bar="0"/> equals <XYZ foo="100" bar="0"></XYZ>
- The text is stored as Unicode, encoded as UTF-8
- Some special characters ('<', '>') are "escaped" ('&lt;', '&gt;')
- Please launch KWord and save an empty file - it is much easier to follow this
  documentation if you wade through an (almost empty) example document :)
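
For filter writers these notes mean that any ordinary XML parser should
do. A minimal Python sketch using the standard xml.etree.ElementTree
module (the XYZ element is just the placeholder from the notes above):

```python
import xml.etree.ElementTree as ET

# Both spellings of an empty element parse to the same tag and attributes.
short = ET.fromstring('<XYZ foo="100" bar="0"/>')
long_form = ET.fromstring('<XYZ foo="100" bar="0"></XYZ>')
same = (short.tag, short.attrib) == (long_form.tag, long_form.attrib)

# Attribute values always arrive as strings; convert numbers explicitly.
width = float(ET.fromstring('<XYZ width="1.03"/>').get("width"))

# Escaped characters are unescaped by the parser.
text = ET.fromstring("<TEXT>a &lt; b &gt; c</TEXT>").text
```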


The tags:
---------

<?xml version="1.0" encoding="UTF-8"?>   Each file starts with this tag.
                             Note: You must not "close" this tag
                             (i.e. don't put a </?xml...> at the end of the
                             file!)


<DOC>, </DOC>                Like <HTML></HTML>. It opens/closes the whole doc.
                             Therefore each of them is only used once.

   Modifiers for <DOC>:
       author="Your Name"           Shouldn't be a problem :)
       email="you@home..."          Neither should this
       editor="KWord"               Name of the editor which has saved the file
       mime="application/x-kword"   Mimetype of the file (i.e. which app to
                                    launch if you click on the icon
                                    representing one of your documents)


<PAPER>, </PAPER>            Is used to define the properties of the paper.
                             Normally this is the first tag in the "header".

   Modifiers for <PAPER>:
       format="1"                   0...DIN A3
                                    1...DIN A4
                                    2...DIN A5
                                    3...US LETTER
                                    4...US LEGAL
                                    5...SCREEN (screen sized)
                                    6...CUSTOM (just enter your preferred size)
                                    7...DIN B5
                                    8...US EXECUTIVE
       ptWidth="595"                Width of the page in pt
       ptHeight="841"               Height of the page in pt
       mmWidth ="210"               Same in mm
       mmHeight="297"               Same in mm
       inchWidth ="8.26772"         Same in inch
       inchHeight="11.6929"         Same in inch
       orientation="0"              0...Portrait
                                    1...Landscape
       columns="1"                  Number of columns
       ptColumnspc="3"              Spacing between columns in pt
       mmColumnspc="1.05833"        Same in mm
       inchColumnspc="0.0416667"    Same in inch
       hType="0"                    0...On all pages (even/odd) the same
                                        headers
                                    1...Different header only on first page
                                    2...Different headers for even/odd pages
       fType="0"                    See hType, header -> footer :)
       ptHeadBody="9"               Distance between header and body in pt
       ptFootBody="9"               Distance between footer and body in pt
       mmHeadBody="3.5"             Same in mm
       mmFootBody="3.5"             Same in mm
       inchHeadBody="0.137795"      Same in inch
       inchFootBody="0.137795"      Same in inch
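
Since every length is stored in pt, mm and inch at the same time, a
filter that writes KWord files has to keep the three in sync. A minimal
Python sketch, assuming the usual 72 pt per inch and 25.4 mm per inch
(the helper names are made up); the example values above look consistent
with these factors, with the pt values cut down to whole numbers:

```python
PT_PER_INCH = 72.0
MM_PER_INCH = 25.4


def pt_to_inch(pt):
    return pt / PT_PER_INCH


def pt_to_mm(pt):
    return pt / PT_PER_INCH * MM_PER_INCH


def mm_to_inch(mm):
    return mm / MM_PER_INCH


def mm_to_pt(mm):
    return mm / MM_PER_INCH * PT_PER_INCH


# DIN A4 as in the example: 210 mm x 297 mm.
a4_inch = (mm_to_inch(210), mm_to_inch(297))
a4_pt = (int(mm_to_pt(210)), int(mm_to_pt(297)))  # int() truncates
```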


<PAPERBORDERS>, </PA...>     Used to specify the borders of the <PAPER>. Should
                             only be used within <PAPER> and </PAPER>!

   Modifiers for <PA...>:
       mmLeft="0"                   This
       mmTop="0"                    should
       mmRight="0"                  be
       mmBottom="0"                 quite
       ptLeft="0"                   self
       ptTop="0"                    explanatory :)
       ptRight="0"
       ptBottom="0"
       inchLeft="0"
       inchTop="0"
       inchRight="0"
       inchBottom="0"


<ATTRIBUTES>, </ATT...>      Some basic settings

   Modifiers for <ATTRIBUTES>:
       processing="1"               0..."Normal" document (Wordprocessing)
                                    1...DTP-document (DTPing)
       standardpage="1"             There can be only "1" :)
       hasHeader="0"                Is there a header? (0/1)
       hasFooter="0"                Is there a footer? (0/1)
       unit="mm"                    Basic unit for positioning, ruler,...


<FOOTNOTEMGR>, </FOO...>     Information for the Footnote-Manager

   Modifiers for <FOOTNOTEMGR>:
       none


<START value="1"/>           This tag stores the value of the first footnote
                             (e.g. "1" means that the first footnote looks
                             like that: [1])

   Modifiers for <START>:
       value="1"                    explained above


<FORMAT>, </FORMAT>          Used to store the formatting options for the
                             footnote. Note: This one must not be used outside
                             the <FOOTNOTEMGR> tags!

   Modifiers for <FORMAT>:
       superscript="1"              [???]
       type="1"                     [???]


<FIRSTPARAG>, </FIRSTPARAG> 

   Modifiers for <FIRSTPARAG>:
       ref="(null)"                 The name of the corresponding paragraph.


<FRAMESETS>, </FRAMESETS>    With this tag you open/close the "frame-section".
                             All your FRAMESETs (notice the small s!) are
                             placed inside (and nowhere else!)
  
   Modifiers for <FRAMESETS>:
       none


<FRAMESET>, </FRAMESET>      This tag defines one frameset. A frameset consists
                             of (at least) one FRAME and one PARAGRAPH.

   Modifiers for <FRAMESET>:
       frameType="1"                0...Base frame (for internal use only!!!)
                                    1...Text frame
                                    2...Picture frame
                                    3...Part frame (e.g. KImage-Part)
       autoCreateNewFrame="1"       Whether KWord should create a new frame if
                                    there is no space left in the old one. 0/1
       frameInfo="0"                0...Body
                                    1...First header
                                    2...Odd header
                                    3...Even header
                                    4...First footer
                                    5...Odd footer
                                    6...Even footer
       grpMgr="grpmgr_0"            The name of the group manager for this
                                    table. (i.e. If this frameset "belongs" to
                                    a table the position and the size are
                                    controlled by a group manager (one for each
                                    table))
       row="0"                      Position in the table (only for "table-
                                    frames"). Index starts at 0.
       col="1"                      Just guess :)
       removeable="0"               Whether the header-frame is removable or
                                    not (notice the typo!). 0/1 [???]


<FRAME>, </FRAME>            Describes the position, property,... of one FRAME.
                             Note: The <FRAME> tag is used like this:
                             <FRAME [modifiers] /> i.e. there are no other tags
                             in between...

   Modifiers for <FRAME>:
       left="28"                    Those four modifiers (left, top, right,
       top="42"                     bottom) describe the size and the position
       right="566"                  of the frame (absolute to the paper).
       bottom="798"                 Note: measured in pt!
       runaround="1"                0...Don't run around frame
                                    1...Run around bounding rectangle
                                    2...Run around contour
       runaGapPT="2"                Run around with gap (in pt)
       runaGapMM="1"                Same in mm
       runaGapINCH="0.0393701"      Same in inch
       lWidth="1"                   Note: Description for all borders (xWidth,
       lRed="255"                   xRed, xGreen, xBlue,...) 
       lGreen="255"
       lBlue="255"                  xWidth...Width of border (pt)
       lStyle="0"                   xRed, xGreen, xBlue...RGB triplet -> color
       rWidth="1"                   of border (e.g. 255, 255, 255 -> white)
       rRed="255"                   xStyle...Style of the border-line:
       rGreen="255"                                        0...Solid
       rBlue="255"                                         1...Dash
       rStyle="0"                                          2...Dot
       tWidth="1"                                          3...Dash-Dot
       tRed="255"                                          4...Dash-Dot-Dot
       tGreen="255"
       tBlue="255"                  x==l -> left border
       tStyle="0"                   x==r -> right border
       bWidth="1"                   x==t -> top border
       bRed="255"                   x==b -> bottom border
       bGreen="255"
       bBlue="255"
       bStyle="0"
       bkRed="255"                  RGB-triplet of background color
       bkGreen="255"
       bkBlue="255"
       bleftpt="0"                  Distance: left border - text/picture in pt
       bleftmm="0"                  Same in mm
       bleftinch="0"                Same in inch
       brightpt="0"                 Distance: right border - text/picture in pt
       brightmm="0"                 Same in mm
       brightinch="0"               Same in inch
       btoppt="0"                   Distance: top border - text/picture in pt
       btopmm="0"                   Same in mm
       btopinch="0"                 Same in inch
       bbottompt="0"                Distance: bottom border - text/pic. in pt
       bbottommm="0"                Same in mm
       bbottominch="0"              Same in inch
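
To show how the x-prefixed border modifiers fit together, here is a
small Python sketch that turns one <FRAME> into a position, a size and
four border descriptions. The function name read_frame, the layout of
the returned dictionary and the fallback value "0" for missing
attributes are assumptions for the example, not KWord behaviour.

```python
import xml.etree.ElementTree as ET

SIDES = {"l": "left", "r": "right", "t": "top", "b": "bottom"}


def read_frame(frame):
    """Return geometry (in pt) and border settings of one <FRAME> element."""
    left, top, right, bottom = (
        float(frame.get(key)) for key in ("left", "top", "right", "bottom")
    )
    borders = {}
    for prefix, side in SIDES.items():
        borders[side] = {
            "width": float(frame.get(prefix + "Width", "0")),
            "color": tuple(
                int(frame.get(prefix + channel, "0"))
                for channel in ("Red", "Green", "Blue")
            ),
            "style": int(frame.get(prefix + "Style", "0")),
        }
    return {
        "x": left,
        "y": top,
        "width": right - left,
        "height": bottom - top,
        "borders": borders,
    }
```

With the example values above (left="28", top="42", right="566",
bottom="798") this gives a frame of 538 x 756 pt.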


<PARAGRAPH>, </PARAGRAPH>    All the information for each paragraph (text,
                             color(s), format(s),...) is stored between these
                             two tags. Each FRAMESET may contain as many
                             PARAGRAPH tags as you want to.

   Modifiers for <PARAGRAPH>:
       none


<TEXT>, </TEXT>              Just guess :) Currently the text is stored as
                             UTF-8 encoded Unicode.
                             Note: the format-tags navigate in the text using
                             an index which starts at 0 and runs up till it
                             reaches length-1. Also the length of the text is
                             used to express format-information.

   Modifiers for <TEXT>:
       none


<NAME>, </NAME>              The name of the paragraph; only used if it is a
                             footnote (see <INFO>).

   Modifiers for <NAME>:
       name="Footnote/Endnote_1"    Just the name of the paragraph the
                                    footnote belongs to


<INFO>, </INFO>              Some paragraph information (will be extended?)

   Modifiers for <INFO>:
       info="0"                     0...No "special" information
                                    1...Footnote (see <NAME>)


<HARDBRK>, </HARDBRK>        Normally the text "flows" through all the frames.
                             Sometimes you want to define a hard break to the
                             next frame (page); i.e. this and the following
                             paragraphs start in a new frame, even if there is
                             enough space in the last one.

   Modifiers for <HARDBRK>:
       frame="0"                    0...let it flow
                                    1...hard break -> next frame


<FORMATS>, </FORMATS>        The text is stored plain. All the formatting is
                             configured between these two tags.

   Modifiers for <FORMATS>:
       none


<FORMAT>, </FORMAT>          These tags describe "runs of text" which share the
                             same formatting properties.

   Modifiers for <FORMAT>:
       id="1"                       0...none (mustn't be in a file)
                                    1..."normal" text
                                    2...a picture
                                    3...tabulator
                                    4...a variable
                                    5...a footnote
       pos="0"                      position in the <TEXT>blabla</TEXT> stream
                                    (of course 0-based :)
       len="5"                      length of the "run of text" which is
                                    formatted using this format (Note: is not
                                    stored for picture-formats (id="2"))
   
   <FORMAT id="1" ...>              a typical text FORMAT
       <COLOR red="0" green="0" blue="0"/>     used to store the text-color
       <FONT name="times"/>                    guess :)
       <SIZE value="12"/>                      once again...
       <WEIGHT value="50"/>                    50 - normal, 75 - bold,...
       <UNDERLINE value="0"/>                  guess :) 0/1
       <VERTALIGN value="0"/>                  0...normal
                                               1...sub
                                               2...super

   <FORMAT id="2"...>               Note: KWord stores only a link to the real
                                    picture! (This should/will change as soon
                                    as we use the new storage spec and the new
                                    Image Container class)
                                    Note: KWord stores a 0-char at the position
                                    of the image!
       <FILENAME value="/home/..."/>

   <FORMAT id="3"...>               Note: KWord stores a 0-char at this pos.!
                                    Note: There is no additional info.

   <FORMAT id="4"...>               a variable (e.g. page number, date,...)
                                    Note: KWord stores a 0-char at this pos.!
       
       <TYPE type="0"/>                        0...date (fix)
                                               1...date (variable)
                                               2...time (fix) 
                                               3...time (variable)
                                               4...page number         
       <POS frameSet="1" frame="1" pageNum="1"/> 
                                               location of the variable
                                               (should be easy to "decode")
       <DATE year="1999" month="8" day="22" fix="1"/>
                                               just guess :)
       <FRMAT>                                 this is like the id="1" part
                                               for formatting the variable's
                                               text
       <COLOR red="0" green="0" blue="0"/>
       <FONT name="times"/>
       <SIZE value="12"/>
       <WEIGHT value="50"/>
       <ITALIC value="0"/>
       <UNDERLINE value="0"/>
       <VERTALIGN value="0"/>
       </FRMAT>

   <FORMAT id="5"...>               a footnote. Note: Once again a 0-char is
                                    stored at this position.
       <INTERNAL>                              [???]
       <PART from="1" to="-1" space="-"/>
       </INTERNAL>
       <RANGE start="1" end="1"/>              [???]
       <TEXT before="[ " after=" ]"/>          guess
                                    
       <DESCRIPT ref="Footnote/Endnote_1"/>    name (reference)
       <FRMAT>                                 to describe the formatting
       <COLOR red="0" green="0" blue="0"/>
       <FONT name="times"/>
       <SIZE value="12"/>
       <WEIGHT value="50"/>
       <ITALIC value="0"/>
       <UNDERLINE value="0"/>
       <VERTALIGN value="2"/>
       </FRMAT>
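
How pos and len map onto the plain text can be sketched in Python like
this. The sketch only looks at "normal" text runs (id="1") and assumes
that <TEXT> and <FORMATS> are direct children of <PARAGRAPH>; the sample
paragraph and the function name text_runs are made up for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<PARAGRAPH>
  <TEXT>Hello world</TEXT>
  <FORMATS>
    <FORMAT id="1" pos="0" len="5"><WEIGHT value="75"/></FORMAT>
    <FORMAT id="1" pos="5" len="6"><WEIGHT value="50"/></FORMAT>
  </FORMATS>
</PARAGRAPH>"""


def text_runs(paragraph):
    """Return (substring, is_bold) for every id="1" run of a paragraph."""
    text = paragraph.findtext("TEXT", default="")
    runs = []
    for fmt in paragraph.findall("FORMATS/FORMAT"):
        if fmt.get("id") != "1":
            continue  # pictures, tabs, variables, footnotes: not handled here
        pos, length = int(fmt.get("pos")), int(fmt.get("len"))
        weight = fmt.find("WEIGHT")
        bold = weight is not None and int(weight.get("value")) >= 75
        runs.append((text[pos:pos + length], bold))
    return runs


runs = text_runs(ET.fromstring(SAMPLE))
```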

<LAYOUT>, </LAYOUT>          KWord supports "Styles" and stores them at the
                             end of the file (STYLES). In addition, each
                             paragraph stores its own copy of the style it
                             uses. That way you can change the style of one
                             single paragraph without changing the style of
                             the whole document that the paragraph style is
                             based on.
   Modifiers for <LAYOUT>:
       none

<NAME>, </NAME>              Name of the Style this paragraph style is based
                             on

   Modifiers for <NAME>:
       value="Standard"             The name :)

<FOLLOWING>, </FOLLOWING>    Name of the style of the following paragraph

   Modifiers for <FOLLOWING>:
       name="Standard"              Once again - the name

<FLOW>, </FLOW>              The alignment of this paragraph.

   Modifiers for <FLOW>:
       value="0"                    0...left
                                    1...right
                                    2...center
                                    3...block

<OHEAD>, </OHEAD>            Distance to the previous paragraph

   Modifiers for <OHEAD>:
       pt="0"
       mm="0"
       inch="0"

<OFOOT>, </OFOOT>            Distance to the next paragraph

   Modifiers for <OFOOT>:
       pt="0"
       mm="0"
       inch="0"

<IFIRST pt="0" mm="0" inch="0"/>         Indent of the first Line
<ILEFT pt="0" mm="0" inch="0"/>          Left indent
<LINESPACE pt="0" mm="0" inch="0"/>      Spacing between the lines of the p.

<COUNTER>, </COUNTER>        Describes the counter-type used for this p.

   Modifiers for <COUNTER>:
       type="0"                     0...none
                                    1...arabic numbers (e.g. "1")
                                    2...low letter (e.g. "a")
                                    3...capital letter (e.g. "A")
                                    4...low letters roman number (e.g. "ii")
                                    5...cap. letters roman number (e.g. "IX")
                                    6...bullet (e.g. "-")
       depth="0"                    0..."1", 1..."1.1" 2..."1.1.1",...
       bullet="176"                 bullet character
       start="1"                    value to start with
       numberingtype="1"            0...list
                                    1...chapter (added to doc-structure)
       lefttext=""                  special text to the left of the counter
       righttext=""                 special text to the right of the counter
                                    (e.g. ".")
       bulletfont="times"           font for bullet-char
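
As an illustration of the counter types, the following Python sketch
builds the label for one paragraph. It is only a guess at how the
modifiers combine: the function names, the way letters continue after
"z" and the handling of depth (ignored here) are assumptions.

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n):
    out = []
    for value, numeral in ROMAN:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


def to_letters(n):
    """1 -> a, 26 -> z, 27 -> aa (continuation past 'z' is an assumption)."""
    out = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        out = chr(ord("a") + rem) + out
    return out


def counter_label(counter_type, number, bullet=176, lefttext="", righttext=""):
    """Build the label for one counter; type 0 means 'no counter'."""
    if counter_type == 0:
        return ""
    core = {
        1: str(number),
        2: to_letters(number),
        3: to_letters(number).upper(),
        4: to_roman(number).lower(),
        5: to_roman(number),
        6: chr(bullet),  # bullet holds a character code
    }[counter_type]
    return lefttext + core + righttext
```

For example, counter_label(1, 3, righttext=".") gives "3." and
counter_label(5, 9) gives "IX".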

<LEFTBORDER red="255" green="255" blue="255" style="0" width="0"/>
<RIGHTBORDER red="255" green="255" blue="255" style="0" width="0"/>
<TOPBORDER red="255" green="255" blue="255" style="0" width="0"/>
<BOTTOMBORDER red="255" green="255" blue="255" style="0" width="0"/>
                             All the borders :)
   Modifiers for <*BORDER>:
       red, green, blue             color
       style="0"                    linestyle
                                    0...solid
                                    1...dash
                                    2...dot
                                    3...dash-dot-dash-dot-...
                                    4...dash-dot-dot-dash-dot-dot-...
       width="0"                    width (in pt)

<FORMAT>                     The format of this "style" (see <FORMAT>)
 <COLOR red="0" green="0" blue="0"/>
 <FONT name="times"/>
 <SIZE value="12"/>
 <WEIGHT value="50"/>
 <ITALIC value="0"/>
 <UNDERLINE value="0"/>
 <VERTALIGN value="0"/>
</FORMAT> 

<TABULATOR mmpos="64.2055" ptpos="182" inchpos="2.52778" type="0"/>
                             Defines a tabulator at the specific position
   Modifiers for <TAB..>:
       type="0"                     0...left
                                    1...center
                                    2...right
                                    3...decimal point

The <STYLES> are a region where the different style types are defined. The 
<STYLE> "syntax" is equal to the <LAYOUT> syntax. After the </STYLES> tag 
there can be <CLIPARTS> and <PIXMAPS>.

