List: freedesktop-poppler
Subject: Re: [poppler] poppler-dump
From: Marco <ctxspi () gmail ! com>
Date: 2014-03-13 9:11:36
Message-ID: CAAVAo4OXzGriUUjP+VdwyjYOCOJAG_nNEH8rYvpvEGuH0zn4zA () mail ! gmail ! com
On 12 Mar 2014 at 20:36, "Albert Astals Cid" <aacid@kde.org> wrote:
>
> >
> > On Wednesday, 12 March 2014, at 20:25:45, Marco wrote:
> > > Hi Albert
> > >
> > > The command 'pdftotext -layout filename.pdf -' gives the same output as
> > > physical_layout in my small program. But if I have a PDF file with text
> > > inside tables (sorry for my poor description) and I run
> > > 'pdftotext filename.pdf -', it gives a result that I cannot reproduce
> > > using 'raw_order_layout' or 'physical_layout' in my program.
> >
> > I'd say it is the other way around, poppler-dump can't give you what
> > -layout does.
> >
> > Compare the code of poppler-page.cpp and pdftotext, it's pretty
> > straightforward.
> >
> > Cheers,
> > Albert
> >
> > >
> > > 2014-03-12 19:49 GMT+01:00 Marco <ctxspi@gmail.com>:
> > > > Hi to all
> > > >
> > > > I am a new user of poppler and I have a short question.
> > > >
> > > > In my small program I use these lines:
> > > >
> > > > for (int i = 0; i < pages; ++i) {
> > > >     cout << "Page " << (i + 1) << "/" << pages << ":" << endl;
> > > >     auto_ptr<poppler::page> p(doc->create_page(i));
> > > >     poppler::byte_array text_ba =
> > > >         p->text(p->page_rect(), poppler::page::raw_order_layout).to_utf8();
> > > >     // byte_array is a std::vector<char>; build the string directly from it
> > > >     // (pushing a 0 first would embed a NUL character in the string)
> > > >     string text(text_ba.begin(), text_ba.end());
> > > >     cout << text << endl;
> > > > }
> > > >
> > > > to print the text of a PDF file, but with either 'raw_order_layout' or
> > > > 'physical_layout' the output differs from that of the command
> > > > 'pdftotext filename.pdf -'.
> > > >
> > > >
> > > > How can I get the same text (stored in a char pointer) as the command
> > > > 'pdftotext filename.pdf -' produces?
> > > >
> > > > Thanks
> >
>
Albert, I am sorry for the inconvenience of this mail.

I have tried it several times, but I need the output not as ustring data
but as a string or a pointer to chars.
I need UTF-8 text, but not in the ustring format.
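
To make the request concrete, here is a minimal sketch of the conversion I am after, assuming the bytes come from ustring::to_utf8() as in the loop quoted above. The helper name bytes_to_string is hypothetical, and byte_array is redeclared locally as std::vector<char> (which is what poppler::byte_array is a typedef for) so that the sketch compiles without poppler installed.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for poppler::byte_array, which is a typedef for std::vector<char>.
// Redeclared here (an assumption for the sketch) so no poppler headers are needed.
typedef std::vector<char> byte_array;

// Hypothetical helper: copy the UTF-8 bytes into a std::string.
// Nothing is pushed onto the vector first, so the string holds no extra NUL.
std::string bytes_to_string(const byte_array &bytes) {
    return std::string(bytes.begin(), bytes.end());
}

// Small self-check with hand-written UTF-8 bytes (0xC3 0xA7 encodes one character).
void example() {
    const char raw[] = {'m', 'a', 'r', '\xC3', '\xA7'};
    byte_array bytes(raw, raw + sizeof(raw));

    std::string text = bytes_to_string(bytes);
    assert(text.size() == 5);            // five bytes, four characters

    // c_str() gives a NUL-terminated pointer that stays valid while text lives.
    const char *c_text = text.c_str();
    assert(std::strlen(c_text) == 5);
}
```

In the real program the vector would be the result of p->text(...).to_utf8(), and the std::string (or its c_str()) could then be passed to code that expects plain chars.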
_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/poppler