List:       koffice
Subject:    Re: Indexing KWord Documents
From:       Thomas <zander () xs4all ! nl>
Date:       2000-10-11 8:23:46

KWord, like all KOffice applications, uses XML to save its files. This
has the advantage that you can easily write a parser to read KWord
files and use only the text (for publishing in HTML, for example).

So what you need is an application that can parse basic XML and pull the
text out of KWord docs (really easy). Either do that on the fly (while
searching) or use that method to build a database you can search.
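
To make the "build a database" option concrete, here is a minimal sketch in
Python. It assumes you already have the XML of each document as a string, and
it collects every piece of text in the file rather than relying on particular
KWord tag names. The function names and the in-memory index are only
illustrative, not part of KOffice.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def extract_text(xml_data):
    """Return all character data in an XML document as one string."""
    root = ET.fromstring(xml_data)
    # itertext() walks every element, so no tag names are assumed here.
    return " ".join(t.strip() for t in root.itertext() if t.strip())


def build_index(docs):
    """Map each lowercase word to the set of document names containing it.

    `docs` is a dict of {document name: XML string} (a hypothetical layout).
    """
    index = defaultdict(set)
    for name, xml_data in docs.items():
        for word in re.findall(r"\w+", extract_text(xml_data).lower()):
            index[word].add(name)
    return index


def search(index, word):
    """Return the sorted names of documents that contain `word`."""
    return sorted(index.get(word.lower(), set()))
```

For the on-the-fly variant you would skip build_index() and simply run
extract_text() on each file at search time, which is simpler but slower on
a large document set.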

The hard part will be to extract the XML file from the tar file KWord
writes.
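
A rough sketch of that extraction step, using Python's standard tarfile
module. The name of the XML member inside the archive is not given here, so
this simply takes the first regular file whose name ends in ".xml"; that
rule and the function name are assumptions, so check them against a real
KWord file.

```python
import tarfile


def read_xml_from_tar(path, suffix=".xml"):
    """Return the bytes of the first archive member whose name ends in `suffix`.

    Assumption: the document XML is stored as a regular file with an
    ".xml" name inside the tar. Adjust `suffix` if your files differ.
    """
    # "r:*" lets tarfile detect plain or compressed tar archives itself.
    with tarfile.open(path, "r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(suffix):
                return archive.extractfile(member).read()
    raise ValueError("no member ending in %r found in %s" % (suffix, path))
```

The returned bytes can be handed straight to an XML parser.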

Let me know what you want to do; I'll be able to get you going.

Cheers!

> Hi folks,
> 
> I am about to ask a question that may be terribly off topic. Please be
> patient and try to give me some directions...
> 
> I work in a company that mainly deals with documents. Nowadays we use
> MS-Word to produce such documents. They are grouped on a server and we have
> Microsoft Index Server (which runs on top of Microsoft IIS) to create a
> catalog of these ".doc" files. So we are able to perform a text search,
> based on the contents of the ".doc" files, and have them exhibited over our
> Intranet.
> 
> We plan to use KOffice's KWord as a replacement for MS-Word. Nevertheless, we
> want to make available to our users the text search feature I have mentioned
> above.
> 
> Is there a product on the market, free or not, that can catalog KWord
> formatted files so that they can be searched and published on an Intranet? We
> think of Apache to publish the pages, but I can't figure out how we can provide
> the text search capability.
 
-- 
Thomas Zander                                            zander@earthling.net
The only thing worse than failure is the fear of trying something new
