From koffice Mon Jan 24 21:19:10 2000
From: Werner Trobin
Date: Mon, 24 Jan 2000 21:19:10 +0000
To: koffice
Subject: Fulfill you request (Filters)
X-MARC-Message: https://marc.info/?l=koffice&m=94875868321615
MIME-Version: 1
Content-Type: multipart/mixed; boundary="--------------B1F5B5EF8BCCF90AECFCBEC9"

This is a multi-part message in MIME format.
--------------B1F5B5EF8BCCF90AECFCBEC9
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hi!

You said you want it - so you get it. The reason I don't like sending this
to the list is that it's really long... Please don't flame me for that :)

Werner

KOFilter template mail :)
#############################################################################

As the filters are a separate library you won't learn much about KDE
programming (only a little bit of Qt - the QTL - the replacement for the
STL). If you still want to help me just read on :)

I don't know if you already use KDE 2.0 - anyway, I'll give you a short
install guide:

I'd like to start by explaining the basic rules for installing KDE 2
(parallel to KDE-1.x.x!) because KOffice needs the new KDE 2.
Note: This is one possibility of hundreds and works for me on my SuSE
Linux 6.1 (it should work on most distros, though):

- Create a new user (it's cleaner that way). Use this user for all the
  following stuff - it will be your KDE 2 user. As you don't change
  anything for the other users, they will keep using KDE-1.x.x
- Download a current Qt-2.1pre snapshot (somewhere on ftp://ftp.troll.no).
  To update it from time to time you might want to install rsync
  (http://rsync.samba.org). To get the newest Qt without downloading the
  (quite large) snapshots, just type:
      rsync -a -v -z rsync.troll.no::qt /your/local/qt-2.1/path
- Set QTDIR, PATH, LD_LIBRARY_PATH,... in the '.bashrc' of your KDE 2 user
  (Simply add these lines and adjust the path (i.e.
  replace 'foo')):

      ### KDE 2 stuff ###########################################
      export WINDOWMANAGER=/home/foo/kde/bin/startkde
      export KDEDIR=/home/foo/kde
      export QTDIR=/home/foo/qt
      export LD_LIBRARY_PATH=/home/foo/qt/lib:/home/foo/kde/lib
      export LD_RUN_PATH=/home/foo/qt/lib:/home/foo/kde/lib
      export PATH=/home/foo/kde/bin:/home/foo/qt/bin:$PATH
      ###########################################################

- Compile Qt and make sure that this Qt replaces your standard Qt-1.4x.
  This is especially important because of the new moc version! If the
  following steps don't work, please check that you are using the correct
  moc.
- You may have to download and install autoconf-2.13 and automake-1.4.
  (They should be standard, but if you need them, they are on
  ftp://ftp.gnu.org/auto*)
- Get at least the packages kdesupport, kdelibs, and koffice; maybe you'd
  like to use kdebase, too. I suggest using CVSup to get the snapshots
  (http://www.kde.org/cvsup.html), but you may download them from the ftp
  server, too. This is easier, but updating the installation takes longer
  because you have to download everything every time.
  (ftp://ftp.kde.org/pub/kde/unstable/CVS/snapshots/current/)
- Compile and install these packages in the correct order (support, libs,
  base, office). Note: I have to use
  './configure --with-qt-libraries=/my/local/qt/path' to make that link
  correctly!
- You might have to edit your startx script and do some additional PATH
  stuff depending on your distro... (i.e. make sure that the 'startkde'
  script from $KDEDIR/bin is used)
- Have fun with KDE-2.0pre Alpha. Note: You have to log out with
  Ctrl+Alt+Backspace as logging out is broken at the moment...

So... now that your KDE-2.0pre Alpha is up and running we can have a look
at the filter stuff. If you are interested in the technical background,
please have a look at the koffice/filters/HOWTO file, browse through the
sources or ask me.
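
Regarding the environment setup and the "correct moc" check in the install
steps above: the following is only a rough sketch in Python, not part of
KDE or Qt. The function names check_kde2_env and moc_comes_from_qtdir are
made up for illustration, and the sketch assumes the layout from the
'.bashrc' lines (bin/ and lib/ directly below $QTDIR and $KDEDIR).

```python
import os
import shutil


def check_kde2_env(env):
    """Return a list of problems found in a KDE 2 style environment mapping."""
    problems = []
    for var in ("QTDIR", "KDEDIR"):
        root = env.get(var)
        if not root:
            problems.append(f"{var} is not set")
            continue
        root = root.rstrip("/")
        # Assumption: each prefix contributes bin/ to PATH and lib/ to
        # LD_LIBRARY_PATH, as in the .bashrc lines of the install guide.
        for list_var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
            entries = [e.rstrip("/") for e in env.get(list_var, "").split(":")]
            wanted = f"{root}/{subdir}"
            if wanted not in entries:
                problems.append(f"{wanted} is missing from {list_var}")
    return problems


def moc_comes_from_qtdir(env):
    """Return True if the first 'moc' found on PATH lives in $QTDIR/bin."""
    qtdir = env.get("QTDIR")
    moc = shutil.which("moc", path=env.get("PATH", ""))
    if not qtdir or moc is None:
        return False
    found_in = os.path.dirname(os.path.abspath(moc))
    return found_in == os.path.abspath(os.path.join(qtdir, "bin"))
```

Run as the KDE 2 user, check_kde2_env(os.environ) should return an empty
list and moc_comes_from_qtdir(os.environ) should return True; anything
else hints that an older Qt is still being picked up first.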

The WinWord filter (koffice/filters/olefilters/winword) is in a very early
stage, but if I have some time in the next weeks it should be able to
import simply formatted text (i.e. bold, italic, chapters, headlines,
tables,...). If you need information on the structure, just ask me. The
main problem is that it's quite difficult to get (correct) documentation.
Have a look at http://msdn.microsoft.com/ - AFAIK there is some info. If
you can't find it there, http://www.wotsit.org might help. There is also
some "inspiration" out there (the wv-library from Caolan McNamara,
http://skynet.csn.ul.ie/~caolan)

The Excel filter is developed by Percy Leonhardt (percy@linuxfreak.com).
This filter is quite usable because Excel uses simple records to store the
information. Therefore you just leave the unknown records out and decode
the important ones. IMHO it's very easy to join Percy because the records
are quite independent, so several people can work on that at the same
time.

The Powerpoint filter is just a template at the moment, but I think this
filter is not *that* important right now.

A very important non-M$ filter is RTF import/export. Maybe you want to
start developing that. Note: IIRC someone already started an RTF import
filter - please ask on the KOffice mailing list!

If you have some code you want to check in, questions, suggestions,
flames,... feel free to mail.

Have fun,
Werner
######################################################################

--------------B1F5B5EF8BCCF90AECFCBEC9
Content-Type: text/plain; charset=us-ascii; name="kword.xml.spec"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="kword.xml.spec"

-------------------------------------------------------------------------------
KWord - XML File Format Description v0.0.3

by Werner, wtrobin@carinthia.com - last changes: 19.09.99

Please be so kind as to report all the errors, typos,... you'll surely find!
Beware: The KWord format is moving quite fast these days (thanks to
Reggie for his great work) so some information might be outdated!
-------------------------------------------------------------------------------

The KWord file format is a (more or less :) human-readable XML format.
This means it consists of HTML-like tags which define the document
structure and the contents. The main structure of each KWord document is a
header and a body. In the header part things like the paper size, the
author,... are stored.

I'd like to start with an example to explain the contents of the body.
You might have noticed that you can do nearly everything with the frames
in your KWord document (e.g. move them around, interlock them, let your
text "flow" through them,...). To achieve this flexibility KWord has to
store the data in a very well defined structure including framesets,
frames, paragraphs, and so on. As you might remember you were able to
choose between templates for "DTPing" and simple "Wordprocessing" right in
the beginning (after launching the killer-app). The only difference is
that KWord offers some help in managing the layout of the first frame in
Wordprocessing mode (in most cases you'll only have one). Due to that
fact "DTPing" offers more flexibility and "Wordprocessing" works almost
automagically :)

Some basic notes:
-----------------
- All kinds of numbers are stored like this: foo="1" (between " and " :)
- A rational number looks like this: width="1.03"
  (note: the '.' is used as "comma")
- equals
- Unicode letters (UTF-8 encoded) are used to store the text
- Some special characters ('<', '>') are "escaped" ('&lt;', '&gt;')
- Please launch KWord and save an empty file - it is much easier to follow
  this documentation if you wade through an (almost empty) example
  document :)

The tags:
---------

Each file starts with this tag. Note: You must not "close" this tag
(i.e. don't put a at the end of the file!)

, Like . It opens/closes the whole doc.
Therefore each of them is only used once.

Modifiers for :
  author="Your Name"           Shouldn't be a problem :)
  email="you@home..."          Shouldn't be one either
  editor="KWord"               Name of the editor which has saved the file
  mime="application/x-kword"   MIME type of the file (i.e. which app to
                               launch if you click on the icon
                               representing one of your documents)

, Is used to define the properties of the paper. Normally this is the
first tag in the "header".

Modifiers for :
  format="1"                  0...DIN A3
                              1...DIN A4
                              2...DIN A5
                              3...US LETTER
                              4...US LEGAL
                              5...SCREEN (screen sized)
                              6...CUSTOM (just enter your preferred size)
                              7...DIN B5
                              8...US EXECUTIVE
  ptWidth="595"               Width of the page in pt
  ptHeight="841"              Height of the page in pt
  mmWidth="210"               Same in mm
  mmHeight="297"              Same in mm
  inchWidth="8.26772"         Same in inch
  inchHeight="11.6929"        Same in inch
  orientation="0"             0...Portrait
                              1...Landscape
  columns="1"                 Number of columns
  ptColumnspc="3"             Spacing between columns in pt
  mmColumnspc="1.05833"       Same in mm
  inchColumnspc="0.0416667"   Same in inch
  hType="0"                   0...The same header on all pages (even/odd)
                              1...Different header only on first page
                              2...Different headers for even/odd pages
  fType="0"                   See hType, header -> footer :)
  ptHeadBody="9"              Distance between header and body in pt
  ptFootBody="9"              Distance between footer and body in pt
  mmHeadBody="3.5"            Same in mm
  mmFootBody="3.5"            Same in mm
  inchHeadBody="0.137795"     Same in inch
  inchFootBody="0.137795"     Same in inch

, Used to specify the borders of the . Should only be used within and !

Modifiers for :
  mmLeft="0"       This should be quite self-explanatory :)
  mmTop="0"
  mmRight="0"
  mmBottom="0"
  ptLeft="0"
  ptTop="0"
  ptRight="0"
  ptBottom="0"
  inchLeft="0"
  inchTop="0"
  inchRight="0"
  inchBottom="0"

, Some basic settings

Modifiers for :
  processing="1"     0..."Normal" document (Wordprocessing)
                     1...DTP-document (DTPing)
  standardpage="1"   There can be only "1" :)
  hasHeader="0"      Is there a header? (0/1)
  hasFooter="0"      Is there a footer? (0/1)
  unit="mm"          Basic unit for positioning, ruler,...
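
Many of the length attributes above come as pt/mm/inch triples. As a
minimal sketch (Python, not part of the original spec), the three values
could be derived from one another as shown below. It assumes 1 inch =
25.4 mm and 1 pt = 1/72 inch, which matches the sample values
ptColumnspc="3", mmColumnspc="1.05833" and inchColumnspc="0.0416667".
The helper names and the six-significant-digit formatting are assumptions
made for illustration. The page-size samples look rounded independently
(595 pt is about 209.9 mm, not 210), so the sketch does not try to
reproduce those.

```python
MM_PER_INCH = 25.4
PT_PER_INCH = 72.0  # assumption: 1 pt = 1/72 inch


def pt_to_mm(pt):
    """Convert points to millimetres."""
    return pt * MM_PER_INCH / PT_PER_INCH


def pt_to_inch(pt):
    """Convert points to inches."""
    return pt / PT_PER_INCH


def mm_to_inch(mm):
    """Convert millimetres to inches."""
    return mm / MM_PER_INCH


def attr(value):
    """Format a number with six significant digits, like the samples."""
    return format(value, ".6g")


def length_attributes(name, pt):
    """Build a pt/mm/inch attribute triple, e.g. for name='Columnspc'."""
    return {
        f"pt{name}": attr(pt),
        f"mm{name}": attr(pt_to_mm(pt)),
        f"inch{name}": attr(pt_to_inch(pt)),
    }
```

For example, length_attributes("Columnspc", 3) yields the three column
spacing values listed above, and attr(mm_to_inch(3.5)) gives the
"0.137795" shown for inchHeadBody.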

, Information for the Footnote-Manager

Modifiers for : none

This tag stores the value of the first footnote (e.g. "1" means that the
first footnote looks like this: [1])

Modifiers for :
  value="1"   explained above

, Used to store the formatting options for the footnote. Note: This one
must not be used outside the tags!

Modifiers for :
  superscript="1"   [???]
  type="1"          [???]

,

Modifiers for :
  ref="(null)"   The name of the corresponding paragraph.

, With this tag you open/close the "frame-section". All your FRAMESETs
(notice the small s!) are placed inside (and nowhere else!)

Modifiers for : none

, This tag defines one frameset. A frameset consists of (at least) one
FRAME and one PARAGRAPH.

Modifiers for :
  frameType="1"            0...Base frame (for internal use only!!!)
                           1...Text frame
                           2...Picture frame
                           3...Part frame (e.g. KImage-Part)
  autoCreateNewFrame="1"   Whether KWord should create a new frame if
                           there is no space left in the old one. 0/1
  frameInfo="0"            0...Body
                           1...First header
                           2...Odd header
                           3...Even header
                           4...First footer
                           5...Odd footer
                           6...Even footer
  grpMgr="grpmgr_0"        The name of the group manager for this table.
                           (i.e. if this frameset "belongs" to a table,
                           the position and the size are controlled by a
                           group manager (one for each table))
  row="0"                  Position in the table (only for
                           "table-frames"). Index starts at 0.
  col="1"                  Just guess :)
  removeable="0"           Whether the header-frame is removable or not
                           (notice the typo!). 0/1 [???]

, Describes the position, property,... of one FRAME. Note: The tag is
used like this: i.e. there are no other tags in between...

Modifiers for :
  left="28"      Those four modifiers (left, top, right, bottom)
  top="42"       describe the size and the position of the frame
  right="566"    (absolute to the paper).
  bottom="798"   Note: measured in pt!
  runaround="1"             0...Don't run around frame
                            1...Run around bounding rectangle
                            2...Run around contour
  runaGapPT="2"             Run around with gap (in pt)
  runaGapMM="1"             Same in mm
  runaGapINCH="0.0393701"   Same in inch
  lWidth="1"      Note: Description for all borders (xWidth, xRed,
  lRed="255"      xGreen, xBlue,...)
  lGreen="255"
  lBlue="255"     xWidth...Width of border (pt)
  lStyle="0"      xRed, xGreen, xBlue...RGB triplet -> color of border
  rWidth="1"      (e.g. 255, 255, 255 -> white)
  rRed="255"      xStyle...Style of the border line:
  rGreen="255"        0...Solid
  rBlue="255"         1...Dash
  rStyle="0"          2...Dot
  tWidth="1"          3...Dash-Dot
  tRed="255"          4...Dash-Dot-Dot
  tGreen="255"
  tBlue="255"     x==l -> left border
  tStyle="0"      x==r -> right border
  bWidth="1"      x==t -> top border
  bRed="255"      x==b -> bottom border
  bGreen="255"
  bBlue="255"
  bStyle="0"
  bkRed="255"     RGB triplet of background color
  bkGreen="255"
  bkBlue="255"
  bleftpt="0"       Distance: left border - text/picture in pt
  bleftmm="0"       Same in mm
  bleftinch="0"     Same in inch
  brightpt="0"      Distance: right border - text/picture in pt
  brightmm="0"      Same in mm
  brightinch="0"    Same in inch
  btoppt="0"        Distance: top border - text/picture in pt
  btopmm="0"        Same in mm
  btopinch="0"      Same in inch
  bbottompt="0"     Distance: bottom border - text/pic. in pt
  bbottommm="0"     Same in mm
  bbottominch="0"   Same in inch

, All the information for each paragraph (text, color(s), format(s),...)
is stored between these two tags. Each FRAMESET may contain as many
PARAGRAPH tags as you want.

Modifiers for : none

, Just guess :) Currently the text is stored as UTF-8 encoded Unicode
characters. Note: the format tags navigate in the text using an index
which starts at 0 and runs up to length-1. The length of the text is also
used to express format information.

Modifiers for : none

, The name of the paragraph; only used if it is a footnote (see ).

Modifiers for :
  name="Footnote/Endnote_1"   Just the name of the paragraph the footnote
                              belongs to

, Some paragraph information (will be extended?)

Modifiers for :
  info="0"   0...No "special" information
             1...Footnote (see )

, Normally the text "flows" through all the frames. Sometimes you want to
define a hard break to the next frame (page); i.e. this and the following
paragraphs start in a new frame, even if there is enough space in the
last one.

Modifiers for :
  frame="0"   0...let it flow
              1...hard break -> next frame

, The text is stored plain. All the formatting is configured between
these two tags.

Modifiers for : none

, These tags describe "runs of text" which share the same formatting
properties.

Modifiers for :
  id="1"    0...none (mustn't be in a file)
            1..."normal" text
            2...a picture
            3...tabulator
            4...a variable
            5...a footnote
  pos="0"   position in the blabla stream (of course 0-based :)
  len="5"   length of the "run of text" which is formatted using this
            format (Note: is not stored for picture formats (id="2"))

a typical text FORMAT
used to store the text-color
guess :)
once again...
50 - normal, 75 - bold,...
guess :) 0/1
0...normal 1...sub 2...super

Note: KWord stores only a link to the real picture! (This should/will
change as soon as we use the new storage spec and the new Image Container
class)
Note: KWord stores a 0-char at the position of the image!

Note: KWord stores a 0-char at this pos.!
Note: There is no additional info.

a variable (e.g. page number, date,...)
Note: KWord stores a 0-char at this pos.!
0...date (fix)
1...date (variable)
2...time (fix)
3...time (variable)
4...page number
location of the variable (should be easy to "decode")
just guess :)
this is like the id="1" part for formatting the variable's text

a footnote. Note: Once again a 0-char is stored at this position.
[???]
[???]
guess
name (reference) to describe the formatting

, KWord supports "Styles" and it stores them at the end of the file
(STYLES). To use them for each paragraph it has to store them there, too.
The reason is that this lets you change the style of one single paragraph
without changing the document-wide style on which the paragraph's style
is based.

Modifiers for : none

, Name of the style this paragraph style is based on

Modifiers for :
  value="Standard"   The name :)

, Name of the style of the following paragraph

Modifiers for :
  name="Standard"   Once again - the name

, The alignment of this paragraph.

Modifiers for :
  value="0"   0...left
              1...right
              2...center
              3...block

, Distance to the previous paragraph

Modifiers for :
  pt="0"
  mm="0"
  inch="0"

, Distance to the next paragraph

Modifiers for :
  pt="0"
  mm="0"
  inch="0"

Indent of the first line
Left indent
Spacing between the lines of the p.

, Describes the counter type used for this p.

Modifiers for :
  type="0"             0...none
                       1...arabic numbers (e.g. "1")
                       2...lower-case letter (e.g. "a")
                       3...capital letter (e.g. "A")
                       4...lower-case roman number (e.g. "ii")
                       5...capital roman number (e.g. "IX")
                       6...bullet (e.g. "-")
  depth="0"            0..."1", 1..."1.1" 2..."1.1.1",...
  bullet="176"         bullet character
  start="1"            value to start with
  numberingtype="1"    0...list
                       1...chapter (added to doc-structure)
  lefttext=""          special text left of counter
  righttext=""         special text right of counter (e.g. ".")
  bulletfont="times"   font for bullet char

All the borders :)

Modifiers for <*BORDER>:
  red, green, blue   color
  style="0"          line style
                     0...solid
                     1...dash
                     2...dot
                     3...dash-dot-dash-dot-...
                     4...dash-dot-dot-dash-dot-dot-...
  width="0"          width (in pt)

The format of this "style" (see )

Defines a tabulator at the specific position

Modifiers for :
  type="0"   0...left
             1...center
             2...right
             3...decimal point

The are a region where the different style types are defined. The