
List:       htmlunit-user
Subject:    [Htmlunit-user] Downloading images
From:       "Kellner, Matt" <mkellner () amazon ! com>
Date:       2011-09-08 20:00:37
Message-ID: D6805F312529F3438F0BEA6A6C152B844C0F46F7 () EX-IAD6-C ! ant ! amazon ! com

Hi there.  I see from searching through old tickets and such that a conscious
decision was made to make WebClient NEVER download any image data.  The HTML
page can contain an image tag, and if you want to download the actual image,
you first have to get the HtmlImage element and call a specific method on that
element to download the data.

I also see that numerous requests have been made to make this an option
(including this old ticket here:
http://osdir.com/ml/java.htmlunit.devel/2007-01/msg00021.html).  So can I ask
why there is no way to specify image download behavior as an option?  Is there
some log somewhere that shows that this option was considered and rejected?

The reason I ask is because if you have an image tag that carries an "onLoad"
event to trigger some JavaScript, that event will never fire unless you
explicitly load the image "manually" through code.  In my case, there is a
complex JavaScript method that fires on a different event, makes an
XMLHttpRequest to a server to grab new HTML code, populates that code in a
cache variable, and attempts to load an image, waits for that image to load,
then makes the new HTML visible on the page.  There is currently no way for me
to get this item to become visible on my page because the process is getting
blocked at the "load image" phase - since HtmlUnit doesn't load images, the
onLoad event just doesn't fire.  And I can't hook into that image tag because
there is no deterministic way to tell what its ID is.
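
To make the blocking concrete, here is a toy model in plain Java.  It does
not use HtmlUnit's API at all: Image, Client, attach, load and runFlow are
hypothetical names invented for this sketch, and "spinner.png" is a made-up
source.  It only illustrates the reasoning above, namely that a handler tied
to an image load never runs when the client skips the download, unless
something loads the image explicitly.

```java
import java.util.ArrayList;
import java.util.List;

public class OnLoadModel {
    /** Hypothetical stand-in for an image element carrying an onLoad handler. */
    static class Image {
        final String src;
        final Runnable onLoad;

        Image(String src, Runnable onLoad) {
            this.src = src;
            this.onLoad = onLoad;
        }
    }

    /** Hypothetical stand-in for a browser-like client (not HtmlUnit's WebClient). */
    static class Client {
        private final boolean downloadImages;
        final List<String> fetched = new ArrayList<>();

        Client(boolean downloadImages) {
            this.downloadImages = downloadImages;
        }

        /** Called when page script sets an image source. */
        void attach(Image img) {
            if (downloadImages) {
                load(img);
            }
            // Otherwise nothing is fetched, so the handler is never invoked.
        }

        /** Explicit "manual" load: fetch the image, then fire its handler. */
        void load(Image img) {
            fetched.add(img.src);
            img.onLoad.run();
        }
    }

    /** Runs the page flow and reports whether the new HTML became visible. */
    static boolean runFlow(boolean downloadImages, boolean loadManually) {
        boolean[] visible = {false};
        Client client = new Client(downloadImages);
        // The handler is the step that reveals the cached HTML.
        Image img = new Image("spinner.png", () -> visible[0] = true);
        client.attach(img);
        if (loadManually) {
            client.load(img);
        }
        return visible[0];
    }

    public static void main(String[] args) {
        System.out.println(runFlow(false, false)); // no download: stays hidden
        System.out.println(runFlow(true, false));  // browser-like: becomes visible
        System.out.println(runFlow(false, true));  // manual load: becomes visible
    }
}
```

The third case is the workaround that is unavailable here, because it needs a
handle on the image element and the element cannot be located reliably.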

It would be really nice if I could tell HtmlUnit to act like a real browser.
It is an excellent tool in almost every other way, but this one issue may
result in my having to switch to using Selenium, which is a much more
expensive (literally) solution in our case.

Thanks.



_______________________________________________
Htmlunit-user mailing list
Htmlunit-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/htmlunit-user

