
List:       koffice-devel
Subject:    Fwd: koffice-devel post from erics@cs.kun.nl requires approval
From:       David Faure <faure () kde ! org>
Date:       2003-10-14 16:02:47

Hello Mr Schabell,

Thank you for your e-mail and your interest in document conversion.
For information, KOffice plans to switch to the OpenOffice file format once
KOffice-1.3 is out (i.e. NOT for 1.3, but for the next release after that).
This should reduce the number of file formats that need to be supported a little bit :)

On the other hand, several of the existing KOffice filters could probably be
very useful to your project.

David.

----------------------------------------------------------------------------------------


Forwarded message from: "drs. Eric D. Schabell" <erics@cs.kun.nl>  (PRONIR)
To: Abiword project <F.J.Franklin@shef.ac.uk>, Open Office project \
<discuss@openoffice.org>, KOffice project <koffice@mail.kde.org>, KOffice devel \
<koffice-devel@mail.kde.org>, JXTA discussion mailinglists <discuss@jxta.org>, Google \
Api project <api-support@google.com>, hans.bossenbroek@luminis.nl, p.jones@edmond.nl, \
erica@wizwise.nl, woody@dstc.edu.au, andrewg@dstc.edu.au, sparky@cs.kun.nl, \
                PRONIR@NIC.SURFNET.NL, Mark@Overmeer.net, hoppie@uvt.nl
CC: pronir-conversion@cs.kun.nl
Date: Today 15:54:23

Hello everyone,

As lead developer on the PRONIR Document Conversion project I wanted to get
back to you on the progress of our research in the area of document
conversion. This is practical research, aimed at producing a usable tool and
an easy-to-use API for external access from existing projects.

At the bottom of this e-mail you will find our original 'call for interest',
submitted to you for review over a year ago. In the time since, we have been
busy researching some of the possibilities for document conversion and
creating the tools/projects listed below (all relevant phases of the global
PRONIR project can be followed in the Scientific Programmers Workshop at
http://www.pronir.nl/pub/spws).

First off, the PRONIR Conversion Clearinghouse has been set up online at:

http://dubyas.sci.kun.nl/cgi-bin/clearinghouse

The goal is to have a central place of storage for the various existing
conversion tools, to be made available for the (currently under development)
DocConversion tool. We have set this up so that it is easy for users to
submit conversion tools and routines that we might not yet have in our
database. Feel free to browse and use the tools we have listed there. Also
see the technical report published about the clearinghouse at:

http://infolab.uvt.nl/people/erics/docs/tech_clearinghouse.pdf

Secondly, we are currently in the beginning stages of our DocConversion
project. We have set up a project site and uploaded the initial framework for
our DocConversion tool, called 'docconversion', on SourceForge:

http://docconversion.sourceforge.net

Here you can take a look at it, participate and suggest improvements as we go
along. Currently at version 0.2, it works only on the local host where it is
installed. The goals are to make use of the Conversion Clearinghouse's
database of conversions, create a smart Broker that can broker document
conversions for clients, and create distributed servers for managing broker
requests for document conversions. Feel free to submit comments and feature
requests to the project site.

We hope you enjoy the results so far. We will try to keep everyone informed,
as we believe that our DocConversion tool will be easy to insert in many
projects that currently only deal with document conversions in passing.

Interested parties, please feel free to contact the authors via the
DocConversion project site or at:

    pronir-conversion@cs.kun.nl

for further information and collaboration possibilities.

--
Mvg/Regards,

/**
  * drs. Eric D. Schabell
  *  Scientific Programmer - (PRONIR)
  *  CentER Applied Research - Tilburg, The Netherlands
  *
  *  e-mail        : erics@cs.kun.nl
  *  Mobile        : +31 (0)6 543 613 15
  *  PRONIR        : http://www.pronir.nl
  *  DocConversion : http://docconversion.sf.net
  **/


##################################

Document Conversion Systems
---------------------------

A call for interest to the Open Source Software community

H.A. Proper and E.D. Schabell
May 17, 2002


Introduction
=========
Imagine yourself sitting at your computer in the near future, working on your
latest document in your favorite editor. You decide to save the document to a
different format than the standard format used by the editor. You try to
'save as...' another format, but this new format does not exist in your
current editor's conversion list. You try 'export' from the options menu and
enter the format you wish to convert to. You receive a message from your
editor that this new format is not available locally, but that it might be
able to search the Internet for an algorithm that could make the conversion
for you. Since you have a connection to the Internet, and feel that it might
be time for a coffee, you answer in the affirmative and the search begins.

By the time you return from getting that cup of coffee the editor has popped
up a message that the conversion algorithm has been found and applied. It
also asks whether you would like to have this new algorithm added to your
local library of conversion tools. Of course, you think, and after giving the
go-ahead your editor reports that the document you were working on has been
converted and the new algorithm has been added to your local library of
conversion tools.

That coffee is tasting even better now that you can further expand on your
document without conversion troubles!


Call for Interest
============
The above scenario might seem a bit far-fetched. However, we are currently
starting up a research project and associated prototype, a core part of
which will provide the functionality sketched above. In the research project,
the seamless conversion functionality will be used to research & develop
information retrieval systems for heterogeneous data sources. As the above
scenario shows, though, such functionality can be used for many other
purposes.

As a first step we are therefore interested in implementing an open &
distributed system for conversions between data objects. We aim to set this
up as an Open Source Software project, since we feel that:

1: this kind of functionality will be useful to applications other than
   information retrieval,
2: other research groups in information retrieval are likely to be faced
   with similar challenges.

We are, therefore, looking for interested parties that would like to
participate in developing, using and/or testing a system as described
above.


Vision
=====
At the moment we envision an open distributed system that uses a peer to
peer (p2p) communication strategy such as the one used in Gnutella. The
system should distinguish between:

- the definition of conversions from one data type to another data type
- the actual implementations of these conversions
- a suitable execution environment for these conversions

This would make the conversion system fairly platform (OS, CPU, Memory)
independent. Searches for appropriate services can be conducted using a p2p
approach.

The conversions we aim to include in the system are nearly endless. They
may include:

- `simple' conversions such as: text to PostScript, PostScript to text,
  Word to text, XML to Word, LaTeX to PostScript, GIF to TIFF, BMP to GIF,
  WAV to MP3, etc.
- whole-part selection, for instance, splitting a mailbox into its constituent
  mails, or splitting a mail into subject, header, body or attachment-set,
  etc.
- aspect conversions, such as: a document's full-content to an abstract,
  a document's full-content to a set of keywords, etc.
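
As an illustration of whole-part selection, the following is a minimal
sketch that splits a single mail into subject, headers and body using
Python's standard email module. The sample message and the function name
split_mail are made up for this example; they are not part of any PRONIR
tool.

```python
from email import message_from_string

# Made-up sample message, used only for illustration.
RAW_MAIL = """\
From: alice@example.org
To: bob@example.org
Subject: Conversion test

Hello Bob,
this is the body.
"""


def split_mail(raw):
    """Split one plain-text mail into its subject, headers and body."""
    msg = message_from_string(raw)
    return {
        "subject": msg["Subject"],
        "headers": dict(msg.items()),
        "body": msg.get_payload(),  # a string for non-multipart mails
    }


parts = split_mail(RAW_MAIL)
print(parts["subject"])
```

Each returned part could then be fed into further conversions as a data
object in its own right.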

Conversions may be composed as well. For example, a Word->Text conversion may
be combined with a Full-Text->Abstract-Text conversion to derive an abstract
from a Word document. The system should be able to figure out such
combinations automatically.
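
One way such combinations could be found automatically is to treat data
types as nodes and conversions as edges of a graph, then search for a path.
The sketch below uses a breadth-first search for this; it is only an
assumption about how it might be done, not the design of the planned system.
The registry CONVERSIONS, the toy converter functions and the type names are
all hypothetical.

```python
from collections import deque


# Toy converters, standing in for real conversion tools.
def word_to_text(doc):
    return doc.replace("<b>", "").replace("</b>", "")


def text_to_abstract(text):
    # Pretend the first sentence is the abstract.
    return text.split(".")[0] + "."


# Hypothetical registry: (source type, target type) -> conversion function.
CONVERSIONS = {
    ("word", "text"): word_to_text,
    ("text", "abstract"): text_to_abstract,
}


def find_chain(source, target, conversions=CONVERSIONS):
    """Breadth-first search for the shortest chain of conversions."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, chain = queue.popleft()
        if current == target:
            return chain
        for (src, dst), func in conversions.items():
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append((dst, chain + [func]))
    return None  # no path between the two types


def convert(data, source, target):
    """Apply the conversions on the found chain one after another."""
    chain = find_chain(source, target)
    if chain is None:
        raise ValueError(f"no conversion path from {source} to {target}")
    for step in chain:
        data = step(data)
    return data


print(convert("<b>First sentence</b>. Second sentence.", "word", "abstract"))
```

Breadth-first search returns a chain with the fewest steps; a real system
would probably also want to weigh the quality or cost of each conversion.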

As you may expect, a powerful typing mechanism is needed. We are considering
using the Typed Object Model (TOM) from
http://tom.library.upenn.edu/sw/index.html as a starting point.

On top of the conversion infrastructure, a host of plug-ins for editors may
be developed that would allow for seamless import/export in different formats.

Possible Components
=================
Some existing Open Source Software projects may be integrated into the planned
system. The p2p infrastructure may be provided by:

* JXTA project, which allows for the concept of providing "services".

Pre-existing conversion routines which may be entered into the system as
conversions:

* a2ps, ghostscript, wv, xpdf, psutils, etc...
* openjade, kea, etc...

Interested parties, please feel free to contact the authors at:
    pronir-conversion@cs.kun.nl
for further information and collaboration possibilities.

_______________________________________________
koffice-devel mailing list
koffice-devel@mail.kde.org
http://mail.kde.org/mailman/listinfo/koffice-devel

