List: freedesktop-poppler
Subject: Re: [poppler] No text extracted by pdftohtml
From: Albert Astals Cid <aacid () kde ! org>
Date: 2010-05-26 19:17:34
Message-ID: 201005262017.35011.aacid () kde ! org
On Sunday, 9 May 2010, Jaime Gómez Obregón wrote:
> Hi everybody,
>
> It seems poppler is unable to extract text from some PDF files:
>
> http://iteisa.com/tmp/poppler-sample.pdf (11 Mb)
>
> pdftohtml from poppler 0.12.4 and 0.12.2 is not able to extract the
> text, and evince shows the document correctly but is unable to select
> its text. However, acroread shows and selects the text correctly (so
> it's normal, editable text and not an image).
>
> Is this normal? Is there any workaround for it?
>
> Everything seems ok with the file:
>
> $ pdfinfo poppler-sample.pdf
> Title: untitled
> Creator: Adobe InDesign CS4 (6.0.4)
> Producer: Acrobat Distiller 9.0.0 (Windows)
> CreationDate: Wed May 5 09:35:12 2010
> ModDate: Wed May 5 09:35:12 2010
> Tagged: no
> Pages: 208
> Encrypted: no
> Page size: 595.276 x 841.89 pts (A4)
> File size: 10536602 bytes
> Optimized: no
> PDF version: 1.4
>
> Best regards,
Please file a bug.
Albert
_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/poppler