List: koffice
Subject: Fulfill you request (Filters)
From: Werner Trobin <wtrobin () carinthia ! com>
Date: 2000-01-24 21:19:10
Hi!
You said you want it - so you get it. The reason I don't like
sending this to the list is that it's really long...
Please don't flame me for that :)
Werner
KOFilter template mail :)
#############################################################################
As the filters are a separate library you won't learn much about
KDE programming (only a little bit of Qt - the QTL - the replacement
for the STL). If you still want to help me just read on :)
I don't know if you already use KDE 2.0 - anyway, I'll give you a
short install guide:
I'd like to start by explaining the basic rules for installing KDE 2
(parallel to KDE-1.x.x!) because KOffice needs the new KDE 2.
Note: This is one possibility of hundreds and works for me on my SuSE
Linux 6.1 (should work on most Distros, though):
- Create a new user (it's cleaner that way). Use this user for all the
following stuff - it will be your KDE 2 user. As you don't change the
stuff for all the other users they will use KDE-1.x.x
- Download a current Qt-2.1pre snapshot (somewhere on ftp://ftp.troll.no).
To update it from time to time you might want to install rsync
(http://rsync.samba.org). To get the newest Qt without downloading the
(quite large) snapshots, just type:
rsync -a -v -z rsync.troll.no::qt /your/local/qt-2.1/path
- Set QTDIR, PATH, LD_LIBRARY_PATH,... in the '.bashrc' of your KDE 2
user (Simply add these lines and adjust the path (i.e. replace 'foo')):
### KDE 2 stuff ###########################################
export WINDOWMANAGER=/home/foo/kde/bin/startkde
export KDEDIR=/home/foo/kde
export QTDIR=/home/foo/qt
export LD_LIBRARY_PATH=/home/foo/qt/lib:/home/foo/kde/lib
export LD_RUN_PATH=/home/foo/qt/lib:/home/foo/kde/lib
export PATH=/home/foo/kde/bin:/home/foo/qt/bin:$PATH
###########################################################
- Compile Qt and make sure that this Qt replaces your standard Qt-1.4x.
This is esp. important because of the new moc version! If the following
stuff doesn't work, please check if you use the correct moc.
- You may have to download & install autoconf-2.13 and automake-1.4.
(Should be standard, but if you need it, it's on ftp://ftp.gnu.org/auto*)
- Get at least the packages kdesupport, kdelibs, and koffice; maybe
you'd like to use kdebase, too. I suggest using CVSup to get the
snapshots (http://www.kde.org/cvsup.html), but you may download them
from the ftp server, too. This is easier, but if you want to update the
installation it takes longer as you have to download everything all
the time. (ftp://ftp.kde.org/pub/kde/unstable/CVS/snapshots/current/)
- Compile and install these packages in the correct order (support, libs,
base, office). Note: I have to use './configure --with-qt-libraries=
/my/local/qt/path' to make that link correctly!
- You might have to edit your startx script and do some additional PATH
stuff depending on your distro... (i.e. make sure that the 'startkde'
script from $KDEDIR/bin is used)
- Have fun with KDE-2.0pre Alpha. Note: You have to logout with Ctrl+Alt+
Backspace as this is broken at the moment...
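If the build fails in strange ways, the "correct moc" check mentioned above could be sketched roughly like this. This is only an illustration: the helper name moc_matches_qtdir is made up for this example, and it assumes that the right moc is the one living in $QTDIR/bin.

```python
import os
import shutil


def moc_matches_qtdir(moc_path, qtdir):
    """Return True if moc_path lives in qtdir/bin (hypothetical helper)."""
    if not moc_path or not qtdir:
        return False
    expected_dir = os.path.realpath(os.path.join(qtdir, "bin"))
    return os.path.dirname(os.path.realpath(moc_path)) == expected_dir


if __name__ == "__main__":
    moc = shutil.which("moc")        # first moc found on PATH
    qtdir = os.environ.get("QTDIR")  # set in .bashrc as shown above
    print("moc on PATH:", moc)
    print("QTDIR      :", qtdir)
    print("looks right:", moc_matches_qtdir(moc, qtdir))
```

If the last line prints False, the old Qt-1.4x moc is probably found first and the PATH line in '.bashrc' needs another look.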
So...now that your KDE-2.0pre Alpha is up and running we can have a look
at the filter stuff. If you are interested in the technical background,
please have a look at the koffice/filters/HOWTO file, browse through the
sources or ask me.
The WinWord filter (koffice/filters/olefilters/winword) is in a very
early stage, but if I have some time in the next weeks it should be able
to import simple formatted text (i.e. bold, italic, chapters, headlines,
tables,...). If you need information on the structure, just ask me.
The main problem is that it's quite difficult to get (correct)
documentation. Have a look at http://msdn.microsoft.com/ - AFAIK there is
some info. If you can't find it there http://www.wotsit.org might help.
There is also some "inspiration" out there (the wv-library from Caolan
McNamara, http://skynet.csn.ul.ie/~caolan)
The Excel filter is developed by Percy Leonhardt (percy@linuxfreak.com).
This filter is quite usable because Excel uses simple records to store
the information. Therefore you just leave the unknown records out and
decode the important ones. IMHO it's very easy to join Percy because
the records are quite independent so more people can work on that at
the same time.
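To illustrate the "leave the unknown records out" idea, here is a minimal sketch in Python (not the actual filter code, which is C++). It assumes the usual BIFF record layout of a 2-byte record id followed by a 2-byte payload length, both little-endian; that layout is not described in this mail. The handlers and the sample byte stream are invented for the example.

```python
import struct

# Record ids this sketch knows about; everything else is skipped.
BOF = 0x0809
EOF = 0x000A


def iter_records(data):
    """Yield (record_id, payload) pairs from a byte string of records."""
    offset = 0
    while offset + 4 <= len(data):
        record_id, length = struct.unpack_from("<HH", data, offset)
        payload = data[offset + 4:offset + 4 + length]
        yield record_id, payload
        offset += 4 + length


def decode_known(data, handlers):
    """Call a handler for each known record and ignore the rest."""
    results = []
    for record_id, payload in iter_records(data):
        handler = handlers.get(record_id)
        if handler is not None:
            results.append(handler(payload))
    return results


# Invented sample stream: BOF, one unknown record (0x1234), EOF.
sample = (
    struct.pack("<HH", BOF, 2) + b"\x00\x06"
    + struct.pack("<HH", 0x1234, 3) + b"abc"
    + struct.pack("<HH", EOF, 0)
)
handlers = {
    BOF: lambda payload: ("BOF", len(payload)),
    EOF: lambda payload: ("EOF", len(payload)),
}
decoded = decode_known(sample, handlers)
```

Because each record carries its own length, a new handler can be added to the table without touching the others, which is why several people can work on the filter in parallel.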
The Powerpoint filter is just a template at the moment, but I think
this filter is not *that* important right now.
A very important non M$ filter is RTF import/export. Maybe you want
to start developing that. Note: IIRC someone already started an RTF
import filter - please ask on the KOffice mailing list!
If you have some code you want to check in, questions, suggestions,
flames,... feel free to mail.
Have fun,
Werner
######################################################################
["kword.xml.spec" (text/plain)]
-------------------------------------------------------------------------------
- -
- KWord - XML File Format Description v0.0.3 -
- -
- by Werner, wtrobin@carinthia.com - last changes: 19.09.99 -
- Please be so kind to report all the errors, typos,... you'll surely find! -
- Beware: The KWord-format is moving quite fast these days (thanks to -
- Reggie for his great work) so some information might be outdated! -
-------------------------------------------------------------------------------
The KWord file format is a (more or less :) human-readable XML format. This
means it consists of HTML-like tags which define the document structure and
the contents. The main structure of each KWord document is a header and a body.
In the header-part things like the paper size, the author,... are stored.
I'd like to start with an example to explain the contents of the body. You
might have noticed that you can do nearly everything with the frames in your
KWord document (e.g. move them around, interlock them, let your text "flow"
through them,...). To achieve this flexibility KWord has to store the data in
a very well defined structure including framesets, frames, paragraphs, and so
on.
As you might remember you were able to choose between templates for "DTPing"
and simple "Wordprocessing" right in the beginning (after launching the killer-
app). The only difference is that KWord offers some help in managing the layout
of the first frame in Wordprocessing-mode (in most cases you'll only have one).
Due to that fact "DTPing" offers more flexibility and "Wordprocessing" works
almost automagically :)
Some basic notes:
-----------------
- All kinds of numbers are stored like this: foo="1" (between " and " :)
- A rational number looks like this: width="1.03" (note: the '.' is used as
"comma")
- <XYZ foo="100" bar="0"/> equals <XYZ foo="100" bar="0"></XYZ>
- Unicode-letters (UTF-8 compressed) are used to store the text
- Some special characters ('<', '>') are "escaped" ('&lt;', '&gt;')
- Please launch KWord and save an empty file - it is much easier to follow this
documentation if you wade through an (almost empty) example document :)
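The basic notes can be tried out with any XML parser. The following is a small sketch using Python's xml.etree.ElementTree; the <XYZ> and <TEXT> snippets are just the toy examples from the notes, not a real KWord file.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<XYZ foo="100" bar="0"/>')
long_form = ET.fromstring('<XYZ foo="100" bar="0"></XYZ>')

# Both spellings yield the same tag and the same attributes.
same = (short_form.tag == long_form.tag
        and short_form.attrib == long_form.attrib)

# Attribute values arrive as strings and are converted explicitly.
foo = int(short_form.get("foo"))
width = float(ET.fromstring('<XYZ width="1.03"/>').get("width"))

# Escaped characters are unescaped by the parser.
text = ET.fromstring("<TEXT>a &lt; b &gt; c</TEXT>").text
```

Note that float() always expects '.' as the decimal separator, which matches the rule above regardless of the user's locale.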
The tags:
---------
<?xml version="1.0" encoding="UTF-8"?> Each file starts with this tag.
Note: You must not "close" this tag
(i.e. don't put a </?xml...> at the end of the
file!)
<DOC>, </DOC> Like <HTML></HTML>. It opens/closes the whole doc.
Therefore each of them is only used once.
Modifiers for <DOC>:
author="Your Name" Shouldn't be a problem :)
email="you@home..." Should neither be
editor="KWord" Name of the editor which has saved the file
mime="application/x-kword" Mimetype of the file (i.e. which app to
launch if you click on the icon
representing one of your documents)
<PAPER>, </PAPER> Is used to define the properties of the paper.
Normally this is the first tag in the "header".
Modifiers for <PAPER>:
format="1" 0...DIN A3
1...DIN A4
2...DIN A5
3...US LETTER
4...US LEGAL
5...SCREEN (screen sized)
6...CUSTOM (just enter your preferred size)
7...DIN B5
8...US EXECUTIVE
ptWidth="595" Width of the page in pt
ptHeight="841" Height of the page in pt
mmWidth ="210" Same in mm
mmHeight="297" Same in mm
inchWidth ="8.26772" Same in inch
inchHeight="11.6929" Same in inch
orientation="0" 0...Portrait
1...Landscape
columns="1" Number of columns
ptColumnspc="3" Spacing between columns in pt
mmColumnspc="1.05833" Same in mm
inchColumnspc="0.0416667" Same in inch
hType="0" 0...On all pages (even/odd) the same
headers
1...Different header only on first page
2...Different headers for even/odd pages
fType="0" See hType, header -> footer :)
ptHeadBody="9" Distance between header and body in pt
ptFootBody="9" Distance between footer and body in pt
mmHeadBody="3.5" Same in mm
mmFootBody="3.5" Same in mm
inchHeadBody="0.137795" Same in inch
inchFootBody="0.137795" Same in inch
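Since every length is stored three times (pt, mm, inch), a filter has to keep the three values in sync. A minimal sketch of the conversions, assuming the usual 72 pt per inch and 25.4 mm per inch; the function names and the PAPER_FORMATS table are made up for this example.

```python
PT_PER_INCH = 72.0
MM_PER_INCH = 25.4

# Values of the format="..." modifier, in the order listed above.
PAPER_FORMATS = [
    "DIN A3", "DIN A4", "DIN A5", "US LETTER", "US LEGAL",
    "SCREEN", "CUSTOM", "DIN B5", "US EXECUTIVE",
]


def pt_to_inch(pt):
    return pt / PT_PER_INCH


def pt_to_mm(pt):
    return pt / PT_PER_INCH * MM_PER_INCH


def mm_to_inch(mm):
    return mm / MM_PER_INCH


def mm_to_pt(mm):
    return mm / MM_PER_INCH * PT_PER_INCH


# DIN A4: 210 mm x 297 mm
a4_width_pt = mm_to_pt(210)
a4_height_pt = mm_to_pt(297)
```

With these factors 3 pt comes out as 1.05833 mm and 0.0416667 inch, like the column spacing above, and A4 is a bit more than 595 x 841 pt. The integer pt values in the example look truncated rather than rounded, but that is only an observation from the numbers shown here.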
<PAPERBORDERS>, </PA...> Used to specify the borders of the <PAPER>. Should
only be used within <PAPER> and </PAPER>!
Modifiers for <PA...>:
mmLeft="0" This
mmTop="0" should
mmRight="0" be
mmBottom="0" quite
ptLeft="0" self
ptTop="0" explanatory :)
ptRight="0"
ptBottom="0"
inchLeft="0"
inchTop="0"
inchRight="0"
inchBottom="0"
<ATTRIBUTES>, </ATT...> Some basic settings
Modifiers for <ATTRIBUTES>:
processing="1" 0..."Normal" document (Wordprocessing)
1...DTP-document (DTPing)
standardpage="1" There can be only "1" :)
hasHeader="0" Is there a header? (0/1)
hasFooter="0" Is there a footer? (0/1)
unit="mm" Basic unit for positioning, ruler,...
<FOOTNOTEMGR>, </FOO...> Information for the Footnote-Manager
Modifiers for <FOOTNOTEMGR>:
none
<START value="1"/> This tag stores the value of the first footnote
(e.g. "1" means that the first footnote looks
like that: [1])
Modifiers for <START>:
value="1" explained above
<FORMAT>, </FORMAT> Used to store the formatting options for the
footnote. Note: This one must not be used outside
the <FOOTNOTEMGR> tags!
Modifiers for <FORMAT>:
superscript="1" [???]
type="1" [???]
<FIRSTPARAG>, </FIRSTPARAG>
Modifiers for <FIRSTPARAG>:
ref="(null)" The name of the corresponding paragraph.
<FRAMESETS>, </FRAMESETS> With this tag you open/close the "frame-section".
All your FRAMESETs (notice the small s!) are
placed inside (and nowhere else!)
Modifiers for <FRAMESETS>:
none
<FRAMESET>, </FRAMESET> This tag defines one frameset. A frameset consists
of (at least) one FRAME and one PARAGRAPH.
Modifiers for <FRAMESET>:
frameType="1" 0...Base frame (for internal use only!!!)
1...Text frame
2...Picture frame
3...Part frame (e.g. KImage-Part)
autoCreateNewFrame="1" Whether KWord should create a new frame if
there is no space left in the old one. 0/1
frameInfo="0" 0...Body
1...First header
2...Odd header
3...Even header
4...First footer
5...Odd footer
6...Even footer
grpMgr="grpmgr_0" The name of the group manager for this
table. (i.e. If this frameset "belongs" to
a table the position and the size are
controlled by a group manager (one for each
table))
row="0" Position in the table (only for "table-
frames"). Index starts at 0.
col="1" Just guess :)
removeable="0" Whether the header-frame is removable or
not (notice the typo!). 0/1 [???]
<FRAME>, </FRAME> Describes the position, property,... of one FRAME.
Note: The <FRAME> tag is used like this:
<FRAME [modifiers] /> i.e. there are no other tags
in between...
Modifiers for <FRAME>:
left="28" Those four modifiers (left, top, right,
top="42" bottom) describe the size and the position
right="566" of the frame (absolute to the paper).
bottom="798" Note: measured in pt!
runaround="1" 0...Don't run around frame
1...Run around bounding rectangle
2...Run around contour
runaGapPT="2" Run around with gap (in pt)
runaGapMM="1" Same in mm
runaGapINCH="0.0393701" Same in inch
lWidth="1" Note: Description for all borders (xWidth,
lRed="255" xRed, xGreen, xBlue,...)
lGreen="255"
lBlue="255" xWidth...Width of border (pt)
lStyle="0" xRed, xGreen, xBlue...RGB triplet -> color
rWidth="1" of border (e.g. 255, 255, 255 -> white)
rRed="255" xStyle...Style of the border-line:
rGreen="255" 0...Solid
rBlue="255" 1...Dash
rStyle="0" 2...Dot
tWidth="1" 3...Dash-Dot
tRed="255" 4...Dash-Dot-Dot
tGreen="255"
tBlue="255" x==l -> left border
tStyle="0" x==r -> right border
bWidth="1" x==t -> top border
bRed="255" x==b -> bottom border
bGreen="255"
bBlue="255"
bStyle="0"
bkRed="255" RGB-triplet of background color
bkGreen="255"
bkBlue="255"
bleftpt="0" Distance: left border - text/picture in pt
bleftmm="0" Same in mm
bleftinch="0" Same in inch
brightpt="0" Distance: right border - text/picture in pt
brightmm="0" Same in mm
brightinch="0" Same in inch
btoppt="0" Distance: top border - text/picture in pt
btopmm="0" Same in mm
btopinch="0" Same in inch
bbottompt="0" Distance: bottom border - text/pic. in pt
bbottommm="0" Same in mm
bbottominch="0" Same in inch
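Reading a FRAMESET and its FRAME might look roughly like the sketch below. The sample snippet is hand-written and heavily shortened, the helper names are invented, and falling back to "0" for a missing border modifier is an assumption of this example, not something the format promises.

```python
import xml.etree.ElementTree as ET

FRAME_TYPES = {0: "base", 1: "text", 2: "picture", 3: "part"}
BORDER_STYLES = {0: "solid", 1: "dash", 2: "dot",
                 3: "dash-dot", 4: "dash-dot-dot"}

sample = """
<FRAMESET frameType="1" autoCreateNewFrame="1" frameInfo="0">
  <FRAME left="28" top="42" right="566" bottom="798" runaround="1"
         lWidth="1" lRed="255" lGreen="255" lBlue="255" lStyle="0"/>
</FRAMESET>
"""


def frame_size(frame):
    """Width and height in pt, from the four edge positions."""
    width = float(frame.get("right")) - float(frame.get("left"))
    height = float(frame.get("bottom")) - float(frame.get("top"))
    return width, height


def border(frame, side):
    """Collect the xWidth/xRed/... modifiers; side is 'l', 'r', 't' or 'b'."""
    return {
        "width": float(frame.get(side + "Width", "0")),
        "color": tuple(int(frame.get(side + channel, "0"))
                       for channel in ("Red", "Green", "Blue")),
        "style": BORDER_STYLES[int(frame.get(side + "Style", "0"))],
    }


frameset = ET.fromstring(sample)
frame = frameset.find("FRAME")
kind = FRAME_TYPES[int(frameset.get("frameType"))]
size = frame_size(frame)
left_border = border(frame, "l")
```

For the values used in this description the frame is 538 pt wide and 756 pt high, with a solid white left border of 1 pt.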
<PARAGRAPH>, </PARAGRAPH> All the information for each paragraph (text,
color(s), format(s),...) is stored between these
two tags. Each FRAMESET may contain as many
PARAGRAPH tags as you want to.
Modifiers for <PARAGRAPH>:
none
<TEXT>, </TEXT> Just guess :) Currently the text is stored as
UTF-8 compressed Unicode glyphs.
Note: the format-tags navigate in the text using
an index which starts at 0 and runs up till it
reaches length-1. Also the length of the text is
used to express format-information.
Modifiers for <TEXT>:
none
<NAME>, </NAME> The name of the paragraph only used if it is a
footnote (see <INFO>).
Modifiers for <NAME>:
name="Footnote/Endnote_1" Just the name of the paragraph where the
footnote belongs to
<INFO>, </INFO> Some paragraph information (will be extended?)
Modifiers for <INFO>:
info="0" 0...No "special" information
1...Footnote (see <NAME>)
<HARDBRK>, </HARDBRK> Normally the text "flows" through all the frames.
Sometimes you want to define a hard break to the
next frame (page); i.e. this and the following
paragraphs start in a new frame, even if there is
enough space in the last one.
Modifiers for <HARDBRK>:
frame="0" 0...let it flow
1...hard break -> next frame
<FORMATS>, </FORMATS> The text is stored plain. All the formatting is
configured between these two tags.
Modifiers for <FORMATS>:
none
<FORMAT>, </FORMAT> These tags describe "runs of text" which share the
same formatting properties.
Modifiers for <FORMAT>:
id="1" 0...none (mustn't be in a file)
1..."normal" text
2...a picture
3...tabulator
4...a variable
5...a footnote
pos="0" position in the <TEXT>blabla</TEXT> stream
(of course 0-based :)
len="5" length of the "run of text" which is
formatted using this format (Note: is not
stored for picture-formats (id="2"))
<FORMAT id="1" ...> a typical text FORMAT
<COLOR red="0" green="0" blue="0"/> used to store the text-color
<FONT name="times"/> guess :)
<SIZE value="12"/> once again...
<WEIGHT value="50"/> 50 - normal, 75 - bold,...
<UNDERLINE value="0"/> guess :) 0/1
<VERTALIGN value="0"/> 0...normal
1...sub
2...super
<FORMAT id="2"...> Note: KWord stores only a link to the real
picture! (This should/will change as soon
as we use the new storage spec and the new
Image Container class)
Note: KWord stores a 0-char at the position
of the image!
<FILENAME value="/home/..."/>
<FORMAT id="3"...> Note: KWord stores a 0-char at this pos.!
Note: There is no additional info.
<FORMAT id="4"...> a variable (e.g. page number, date,...)
Note: KWord stores a 0-char at this pos.!
<TYPE type="0"/> 0...date (fix)
1...date (variable)
2...time (fix)
3...time (variable)
4...page number
<POS frameSet="1" frame="1" pageNum="1"/>
location of the variable
(should be easy to "decode")
<DATE year="1999" month="8" day="22" fix="1"/>
just guess :)
<FRMAT> this is like the id="1" part
for formatting the variable's
text
<COLOR red="0" green="0" blue="0"/>
<FONT name="times"/>
<SIZE value="12"/>
<WEIGHT value="50"/>
<ITALIC value="0"/>
<UNDERLINE value="0"/>
<VERTALIGN value="0"/>
</FRMAT>
<FORMAT id="5"...> a footnote. Note: Once again a 0-char is
stored at this position.
<INTERNAL> [???]
<PART from="1" to="-1" space="-"/>
</INTERNAL>
<RANGE start="1" end="1"/> [???]
<TEXT before="[ " after=" ]"/> guess
<DESCRIPT ref="Footnote/Endnote_1"/> name (reference)
<FRMAT> to describe the formatting
<COLOR red="0" green="0" blue="0"/>
<FONT name="times"/>
<SIZE value="12"/>
<WEIGHT value="50"/>
<ITALIC value="0"/>
<UNDERLINE value="0"/>
<VERTALIGN value="2"/>
</FRMAT>
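How pos and len map onto the plain <TEXT> can be sketched like this. The sample paragraph is hand-written, the function name text_runs is made up, and treating every weight of 75 or more as bold is an assumption of this example (the description above only says 50 is normal and 75 is bold).

```python
import xml.etree.ElementTree as ET

sample = """
<PARAGRAPH>
  <TEXT>Hello world</TEXT>
  <FORMATS>
    <FORMAT id="1" pos="0" len="5">
      <WEIGHT value="75"/>
    </FORMAT>
    <FORMAT id="1" pos="5" len="6">
      <WEIGHT value="50"/>
    </FORMAT>
  </FORMATS>
</PARAGRAPH>
"""


def text_runs(paragraph):
    """Return (text, is_bold) for each "normal" text run of a paragraph."""
    text = paragraph.findtext("TEXT", default="")
    runs = []
    for fmt in paragraph.iterfind("FORMATS/FORMAT"):
        if fmt.get("id") != "1":
            continue  # pictures, tabs, variables, footnotes: skipped here
        pos = int(fmt.get("pos"))
        length = int(fmt.get("len"))
        weight = fmt.find("WEIGHT")
        bold = weight is not None and int(weight.get("value")) >= 75
        runs.append((text[pos:pos + length], bold))
    return runs


runs = text_runs(ET.fromstring(sample))
```

Here the first run is "Hello" in bold and the second is " world" in normal weight. The 0-chars stored for pictures, tabulators, variables and footnotes count as one position each, so a real filter must not strip them before slicing.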
<LAYOUT>, </LAYOUT> KWord supports "Styles" and stores them at the
end of the file (STYLES). Each paragraph also
stores its own copy of the style it uses. That
way you can change the style of a single
paragraph without changing the document-wide
style it is based on.
Modifiers for <LAYOUT>:
none
<NAME>, </NAME> Name of the Style this paragraph style is based
on
Modifiers for <NAME>:
value="Standard" The name :)
<FOLLOWING>, </FOLLOWING> Name of the style of the following paragraph
Modifiers for <FOLLOWING>:
name="Standard" Once again - the name
<FLOW>, </FLOW> The alignment of this paragraph.
Modifiers for <FLOW>:
value="0" 0...left
1...right
2...center
3...block
<OHEAD>, </OHEAD> Distance to the last paragraph
Modifiers for <OHEAD>:
pt="0"
mm="0"
inch="0"
<OFOOT>, </OFOOT> Distance to the next paragraph
Modifiers for <OFOOT>:
pt="0"
mm="0"
inch="0"
<IFIRST pt="0" mm="0" inch="0"/> Indent of the first Line
<ILEFT pt="0" mm="0" inch="0"/> Left indent
<LINESPACE pt="0" mm="0" inch="0"/> Spacing between the lines of the p.
<COUNTER>, </COUNTER> Describes the counter-type used for this p.
Modifiers for <COUNTER>:
type="0" 0...none
1...arabic numbers (e.g. "1")
2...low letter (e.g. "a")
3...capital letter (e.g. "A")
4...low letters roman number (e.g. "ii")
5...cap. letters roman number (e.g. "IX")
6...bullet (e.g. "-")
depth="0" 0..."1", 1..."1.1" 2..."1.1.1",...
bullet="176" bullet character
start="1" value to start with
numberingtype="1" 0...list
1...chapter (added to doc-structure)
lefttext="" special text left to counter
righttext="" special text right to counter (e.g. ".")
bulletfont="times" font for bullet-char
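To make the counter types a bit more concrete, here is a rough sketch of how a filter could turn a counter value into its label. Everything in it is illustrative: the function names are invented, continuing the letters after "z" with "aa" is a guess, and reading bullet="176" as a character code is an assumption as well.

```python
def to_roman(n):
    """Upper-case roman numeral for a positive integer."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def to_letters(n):
    """1 -> a, 26 -> z, 27 -> aa (what follows z is a guess)."""
    out = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        out = chr(ord("a") + rem) + out
    return out


def format_counter(counter_type, value, bullet=176, lefttext="", righttext=""):
    """Build the counter label for one paragraph."""
    if counter_type == 0:
        return ""  # no counter at all
    bodies = {
        1: lambda: str(value),
        2: lambda: to_letters(value),
        3: lambda: to_letters(value).upper(),
        4: lambda: to_roman(value).lower(),
        5: lambda: to_roman(value),
        6: lambda: chr(bullet),  # assumption: bullet is a character code
    }
    return lefttext + bodies[counter_type]() + righttext
```

For example, format_counter(1, 1, righttext=".") gives "1." and format_counter(5, 9) gives "IX". The depth modifier (1, 1.1, 1.1.1,...) is left out here because it needs the counters of the enclosing levels.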
<LEFTBORDER red="255" green="255" blue="255" style="0" width="0"/>
<RIGHTBORDER red="255" green="255" blue="255" style="0" width="0"/>
<TOPBORDER red="255" green="255" blue="255" style="0" width="0"/>
<BOTTOMBORDER red="255" green="255" blue="255" style="0" width="0"/>
All the borders :)
Modifiers for <*BORDER>:
red, green, blue color
style="0" linestyle
0...solid
1...dash
2...dot
3...dash-dot-dash-dot-...
4...dash-dot-dot-dash-dot-dot-...
width="0" width (in pt)
<FORMAT> The format of this "style" (see <FORMAT>)
<COLOR red="0" green="0" blue="0"/>
<FONT name="times"/>
<SIZE value="12"/>
<WEIGHT value="50"/>
<ITALIC value="0"/>
<UNDERLINE value="0"/>
<VERTALIGN value="0"/>
</FORMAT>
<TABULATOR mmpos="64.2055" ptpos="182" inchpos="2.52778" type="0"/>
Defines a tabulator at the specific position
Modifiers for <TAB..>:
type="0" 0...left
1...center
2...right
3...decimal point
The <STYLES> are a region where the different style types are defined. The
<STYLE> "syntax" is equal to the <LAYOUT> syntax. After the </STYLES> tag
there can be <CLIPARTS> and <PIXMAPS>.