
List:       htmlunit-user
Subject:    Re: [Htmlunit-user] text of unknown page type
From:       Adolfo Rodriguez <pellyadolfo () yahoo ! es>
Date:       2013-08-12 7:46:21
Message-ID: 1376293581.12619.YahooMailNeo () web171604 ! mail ! ir2 ! yahoo ! com


I do:

Page jobPage = webClient.getPage(url);
String mimeType = jobPage.getWebResponse().getContentType();

to detect PDFs and other MIME types.
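
A minimal sketch of what could be done with that mimeType string afterwards, using only the JDK. The names MimeKind, Kind and classify are made up for illustration and are not part of HtmlUnit. The parameter stripping is a precaution in case the value carries something like "; charset=UTF-8".

```java
import java.util.Locale;

// Hypothetical helper (not an HtmlUnit class): maps a Content-Type value
// to a coarse category so the caller can decide how to read the page.
class MimeKind {

    enum Kind { HTML, PLAIN_TEXT, PDF, OTHER }

    static Kind classify(String contentType) {
        if (contentType == null) {
            return Kind.OTHER;
        }
        // Drop any parameters and normalise case before comparing.
        String base = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        switch (base) {
            case "text/html":
            case "application/xhtml+xml":
                return Kind.HTML;
            case "text/plain":
                return Kind.PLAIN_TEXT;
            case "application/pdf":
                return Kind.PDF;
            default:
                return Kind.OTHER;
        }
    }

    public static void main(String[] args) {
        // In real code the argument would be the mimeType obtained above.
        System.out.println(classify("application/pdf")); // prints "PDF"
    }
}
```

With a category in hand, the caller can cast to HtmlPage only for HTML and treat everything else as raw response data.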


HTH



> ________________________________
> From: "regs@airpost.net" <regs@airpost.net>
> To: htmlunit-user@lists.sourceforge.net
> Sent: Monday, August 12, 2013 9:27
> Subject: Re: [Htmlunit-user] text of unknown page type
> 
> 
> 
> 
> Hi,
> 
> 
> I use an if inside the listener registered with addWebWindowListener to
> tell HTML pages apart from other content.
> For instance:
> 
> public void webWindowContentChanged(WebWindowEvent event) {
>     if (event.getNewPage() instanceof HtmlPage) {
>         // handle the HTML page here
>     }
> }
> 
> To read/output the contents of the page I use asText():
> 
> final HtmlPage page1 = webClient.getPage("http://dot.com");
> System.out.println(page1.asText());
> 
> 
> On Sun, Aug 11, 2013, at 16:50, David Michael Gang wrote:
> 
> > Hi all,
> > 
> > I want to get pages from certain web sites and I don't know in advance whether
> > a page is an HtmlPage or another page type. Is there a generic way to get the
> > content of the page? If I try to assign a TextPage to an HtmlPage I get an exception.
> > 
> > Thanks,
> > 
> > David
> > 
> > ------------------------------------------------------------------------------
> > 
> > Get 100% visibility into Java/.NET code with AppDynamics Lite!
> > 
> > It's a free troubleshooting tool designed for production.
> > 
> > Get down to code-level detail for bottlenecks, with <2% overhead. 
> > 
> > Download for free and get started troubleshooting in minutes. 
> > 
> > http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
> > 
> > _______________________________________________
> > 
> > Htmlunit-user mailing list
> > 
> > Htmlunit-user@lists.sourceforge.net
> > 
> > https://lists.sourceforge.net/lists/listinfo/htmlunit-user
> > 




