'Re: [poppler] No text extracted by pdftohtml'


List:       freedesktop-poppler
Subject:    Re: [poppler] No text extracted by pdftohtml
From:       Albert Astals Cid <aacid () kde ! org>
Date:       2010-05-26 19:17:34
Message-ID: 201005262017.35011.aacid () kde ! org

On Sunday, 9 May 2010, Jaime Gómez Obregón wrote:
> Hi everybody,
> 
> It seems poppler is unable to extract text from some PDF files:
> 
> http://iteisa.com/tmp/poppler-sample.pdf (11 MB)
> 
> pdftohtml from poppler 0.12.4 and 0.12.2 is not able to extract the
> text, and evince displays the document correctly but cannot select
> its text. However, acroread displays and selects the text correctly (so
> it's normal, editable text and not an image).
> 
> Is this normal? Is there a workaround?
> 
> Everything seems OK with the file:
> 
> $ pdfinfo poppler-sample.pdf
> Title:          untitled
> Creator:        Adobe InDesign CS4 (6.0.4)
> Producer:       Acrobat Distiller 9.0.0 (Windows)
> CreationDate:   Wed May  5 09:35:12 2010
> ModDate:        Wed May  5 09:35:12 2010
> Tagged:         no
> Pages:          208
> Encrypted:      no
> Page size:      595.276 x 841.89 pts (A4)
> File size:      10536602 bytes
> Optimized:      no
> PDF version:    1.4
> 
> Best regards,

Please file a bug.

Albert
_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/poppler
