List:       koffice-devel
Subject:    Re: pdf import in KWord
From:       Albert Astals Cid <aacid () kde ! org>
Date:       2006-10-17 17:46:33
Message-ID: 200610171946.33890.aacid () kde ! org

On Tuesday 17 October 2006 12:39, you wrote:
> On Tuesday 17 October 2006 17:40, Jan Hambrecht wrote:
> > Cyrille Berger wrote:
> > >> It's been on the TODO list for quite some time now, but nobody ever
> > >> actually started work on it.
> > >
> > > As a first step I wanted to port the current code to use poppler, but
> > > the needed headers from poppler aren't part of the official release, so
> > > if you want to use poppler you have to write it from scratch using the
> > > Qt4 binding, and probably extend the API of the Qt4 binding. There
> > > is the same problem for karbon.
> > >
> > > As far as I can see, the poppler-qt4 API gives access to information
> > > about the position of text boxes, but not to vector shapes, which can
> > > only be retrieved as rasterized images.
> > >
> > > For krita the only thing I missed was support for CMYK, but I decided
> > > it wasn't worth the extra fighting with the internal API.
> > >
> > > The best thing to do is to contact the okular team about those issues.
> >
> > I looked briefly at the poppler source a few months ago and thought that
> > writing a special output device is the way to go.
>
> I wrote some of the Qt4 bindings in poppler. I would have to agree - you
> are never going to get the level of detail you need out of the current Qt4
> bindings (even if I ever get around to finishing the Arthur renderer). All
> the bindings can do is say "render to pixmap".

Wrong, they can give you the position of each character in the page, although 
probably that is not enough either.

> I think the best way to add ODF support is as a new OutputDev (as Jan
> and Martin pointed out) within the poppler codebase. There are some good
> hooks in PDF (and now in poppler) to extract useful information like
> reading order.

I'll keep repeating, even if everyone keeps ignoring me, that there is someone
from the Abiword camp doing a pdf to xml outputdev using poppler, and that this
might be enough to get an ODF file, or if not enough, a good starting point.
Tell me if you change your mind and want to contact him.

> This should be able to be incorporated into okular as "export to ODF",
> although it would require some changes to the Okular::Generator class.

Not sure an "export to ODF" function is worth having in okular; how many
backends are we going to get that can do it?

Albert

> However I'm not sure what the "render to ODF" part is going to look like.
> Do you want to render a single page at a time? Do you want a different kind
> of rendering if the application is KWord, Krita or Karbon? Pass over a
> memory buffer or via a temp file?
>
> Martin pointed out some issues with fonts - do you want to try to make the
> file editable (so we should try to substitute the font based on metrics),
> or do you want to try to make it look like the original (so we should
> render the text and pass over an image of the glyphs)?
>
> Note that it is also possible to have a PDF that is just an image of the
> page. Is it important to try to OCR the image?
>
> Brad
_______________________________________________
koffice-devel mailing list
koffice-devel@kde.org
https://mail.kde.org/mailman/listinfo/koffice-devel