List: htmlunit-user
Subject: [Htmlunit-user] Fw: ArrayIndexOutOfBoundsException in
From: Ahmed Ashour <asashour () yahoo ! com>
Date: 2009-12-16 5:33:52
Message-ID: 545125.11583.qm () web112222 ! mail ! gq1 ! yahoo ! com
Hello Charles,
* You must subscribe to the list before posting.
* With 2.7-snapshot, I have a different error.
Anyway, please open a bug report.
Ahmed
----
Blog: http://asashour.blogspot.com
----- Forwarded Message ----
The attached message has been automatically discarded.
Date: Tue, 15 Dec 2009 05:57:05 -0800 (PST)
From: Charles de Villiers <chasdev@yahoo.com>
Subject: ArrayIndexOutOfBoundsException in EncodingSniffer
To: htmlunit-user@lists.sourceforge.net
Hi,
I am using HtmlUnit 2.6 to scrape a website:
http://infopost.spectraenergy.com/infopost/default.asp?pipe=AG
This site loads & works properly in IE and Firefox.
In HtmlUnit, one of the iframes fails to load (404 error), so I try to ignore it by
calling webClient.setThrowExceptionOnFailingStatusCode(false); before calling
getPage(). (I'm not sure if this part is relevant.) Anyway, I then get another error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 4099
    at com.gargoylesoftware.htmlunit.util.EncodingSniffer.indexOfSubArray(EncodingSniffer.java:716)
    at com.gargoylesoftware.htmlunit.util.EncodingSniffer.sniffEncodingFromMetaTag(EncodingSniffer.java:351)
    at com.gargoylesoftware.htmlunit.util.EncodingSniffer.sniffHtmlEncoding(EncodingSniffer.java:225)
    at com.gargoylesoftware.htmlunit.util.EncodingSniffer.sniffEncoding(EncodingSniffer.java:134)
    at com.gargoylesoftware.htmlunit.WebResponseImpl.getContentCharsetOrNull(WebResponseImpl.java:179)
    at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:329)
    at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:304)
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:134)
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:101)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:447)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:330)
    at com.gargoylesoftware.htmlunit.html.BaseFrame.loadInnerPageIfPossible(BaseFrame.java:125)
    at com.gargoylesoftware.htmlunit.html.BaseFrame.loadInnerPage(BaseFrame.java:95)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.loadFrames(HtmlPage.java:1780)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.initialize(HtmlPage.java:176)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:454)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:330)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:387)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:372)
    at GetPage.main(GetPage.java:21)
I'm not sure how to get around this one! Can anyone help please?
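
For anyone reading the trace: the top frame is indexOfSubArray, and an ArrayIndexOutOfBoundsException there would be consistent with a byte-sequence search reading past the end of its buffer, although the trace alone does not confirm that. Below is a minimal sketch of a bounds-checked sub-array search. It is not HtmlUnit's actual code; the class name SubArraySearch, the method body, and the choice to return -1 for an empty pattern are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the EncodingSniffer source.
public class SubArraySearch {

    /**
     * Returns the index of the first occurrence of sub within array at or
     * after startIndex, or -1 if there is none. An empty sub returns -1.
     */
    public static int indexOfSubArray(byte[] array, byte[] sub, int startIndex) {
        if (sub.length == 0 || startIndex < 0) {
            return -1;
        }
        // Last start position at which sub still fits entirely inside array.
        // Stopping here is what keeps array[i + j] within bounds.
        int lastStart = array.length - sub.length;
        for (int i = startIndex; i <= lastStart; i++) {
            boolean match = true;
            for (int j = 0; j < sub.length; j++) {
                if (array[i + j] != sub[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] tag = "<meta".getBytes(StandardCharsets.US_ASCII);

        // Pattern present: "<meta" starts at offset 12.
        byte[] page = "<html><head><meta".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOfSubArray(page, tag, 0)); // prints 12

        // Pattern cut off at the end of the buffer: no exception, just -1.
        byte[] truncated = "<html><me".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOfSubArray(truncated, tag, 0)); // prints -1
    }
}
```

The key point is the loop bound: the outer index never exceeds array.length - sub.length, so a pattern that is cut off at the end of the buffer yields -1 instead of an exception.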