List: freedesktop-poppler
Subject: Re: [poppler] poppler-dump
From: Marco <ctxspi () gmail ! com>
Date: 2014-03-13 16:32:30
Message-ID: CAAVAo4Pa6EvP1FJo+1tAyseg6op4Lrn_eMDtJZy98t=MZS_ZDA () mail ! gmail ! com
Solved.
I used the -bbox option of pdftotext.
Thanks to all.
2014-03-13 13:07 GMT+01:00 Marco <ctxspi@gmail.com>:
> I found the logic in TextOutputDev.cc, inside the function 'void
> TextPage::dump(void *outputStream, TextOutputFunc outputFunc, GBool
> physLayout) { ... }', that gives the layout I want. The code is:
>
> ...
>   } else {
>     for (flow = flows; flow; flow = flow->next) {
>       for (blk = flow->blocks; blk; blk = blk->next) {
>         for (line = blk->lines; line; line = line->next) {
>           n = line->len;
>           if (line->hyphenated && (line->next || blk->next)) {
>             --n;
>           }
>           s = new GooString();
>           dumpFragment(line->text, n, uMap, s);
>           (*outputFunc)(outputStream, s->getCString(), s->getLength());
>           delete s;
>           // output a newline when a hyphen is not suppressed
>           if (n == line->len) {
>             (*outputFunc)(outputStream, eol, eolLen);
>           }
>         }
>       }
>       (*outputFunc)(outputStream, eol, eolLen);
>     }
>   }
>
> Do you know whether the same method can be ported to poppler-dump.cc?
>
> Could you please explain how to do it?
>
> As you can see from my first mail, the problem of printing to a string is solved.
>
>
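The hyphen handling in the quoted loop can be tried outside poppler with a small stand-alone sketch. The types below (Line, Block, Flow) and the function dump_flows are hypothetical stand-ins for poppler's internal TextLine, TextBlock and TextFlow, not part of any poppler API. The sketch only mirrors the rule in the quoted code: drop the last character of a hyphenated line when another line or block follows in the same flow, and emit a newline only when nothing was dropped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for poppler's internal text structures.
struct Line {
    std::string text;  // text of the line; a hyphenated line ends with '-'
    bool hyphenated;   // true when the line ends in a hyphenation break
};
using Block = std::vector<Line>;  // a block is a sequence of lines
using Flow = std::vector<Block>;  // a flow is a sequence of blocks

// Concatenate all lines in reading order, joining hyphenated line breaks.
std::string dump_flows(const std::vector<Flow> &flows,
                       const std::string &eol = "\n")
{
    std::string out;
    for (const Flow &flow : flows) {
        for (std::size_t b = 0; b < flow.size(); ++b) {
            const Block &blk = flow[b];
            for (std::size_t l = 0; l < blk.size(); ++l) {
                const Line &line = blk[l];
                // Same test as (line->next || blk->next) in the quoted code.
                const bool has_next =
                    (l + 1 < blk.size()) || (b + 1 < flow.size());
                std::size_t n = line.text.size();
                const bool drop_hyphen =
                    line.hyphenated && has_next && n > 0;
                if (drop_hyphen) {
                    --n;  // suppress the trailing hyphen
                }
                out.append(line.text, 0, n);
                // Output a newline only when the hyphen was not suppressed.
                if (!drop_hyphen) {
                    out += eol;
                }
            }
        }
        out += eol;  // blank line between flows
    }
    return out;
}
```

A hyphenated line that is the last line of the last block in its flow keeps its hyphen and its newline, as in the quoted code.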
> 2014-03-13 12:30 GMT+01:00 Marco <ctxspi@gmail.com>:
>
>> Hi Brad
>>
>> I think the main problem is that the poppler-cpp library cannot output
>> a PDF file's text the same way pdftotext does (the command without any layout option).
>>
>>
>> 2014-03-13 10:20 GMT+01:00 Brad Hards <bradh@frogmouth.net>:
>>
>>> On Thu, 13 Mar 2014 10:11:36 AM Marco wrote:
>>> > I have tried it several times, but I need the output not as ustring
>>> > data but as a string or a pointer to chars.
>>> >
>>> > I need the UTF-8 charset, but not in the ustring format.
>>> From cpp/poppler-global.h header:
>>>
>>> class POPPLER_CPP_EXPORT ustring : public std::basic_string<unsigned short>
>>> {
>>> public:
>>>     ustring();
>>>     ustring(size_type len, value_type ch);
>>>     ~ustring();
>>>
>>>     byte_array to_utf8() const;
>>>     std::string to_latin1() const;
>>>
>>>     static ustring from_utf8(const char *str, int len = -1);
>>>     static ustring from_latin1(const std::string &str);
>>>     ...
>>> }
>>>
>>>
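The header excerpt above suggests one way to get a plain string or char pointer: call to_utf8() and copy the returned bytes. Below is a minimal sketch, assuming poppler::byte_array is a typedef for std::vector<char>; the local byte_array alias and the helper name to_std_string are stand-ins introduced here so the snippet compiles without poppler.

```cpp
#include <string>
#include <vector>

// Stand-in for poppler::byte_array (assumed to be std::vector<char>).
using byte_array = std::vector<char>;

// Copy the UTF-8 bytes into a std::string; c_str() then gives a
// NUL-terminated pointer to chars that stays valid while the string lives.
std::string to_std_string(const byte_array &bytes)
{
    return std::string(bytes.begin(), bytes.end());
}

// With poppler-cpp this would be used roughly as follows (not compiled
// here; 'page' is assumed to be a valid poppler::page pointer):
//   poppler::ustring u = page->text();
//   std::string utf8 = to_std_string(u.to_utf8());
//   const char *p = utf8.c_str();
```

Copying into a std::string avoids depending on whether the byte array itself is NUL-terminated.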
--
It is better to cultivate GNU/Linux... Windows crashes all by itself anyway!!
_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/poppler